In the Title:

Please delete the title and insert therefor: --METHODS OF TREATING
HYPERPROLIFERATIVE DISORDERS USING RETINOBLASTOMA FUSION PROTEINS--

In the Specification:

At page 2, line 5, please delete "518-521" and insert --518-522--.

At page 3, line 17, after "Figure 1A" insert --(SEQ ID NO:1)--.

At page 3, line 19, after "Figure 1B" insert --(SEQ ID NO:2)--.

At page 3, line 21, after "Figure 2A" insert --(SEQ ID NO:3)--.

At page 3, line 23, after "Figure 2B" insert --(SEQ ID NO:4)--.

At page 3, line 26, after "Figure 4" insert --(SEQ ID NOS:5-18)--.

At page 3, line 32, after "Figure 8" insert --(SEQ ID NOS:33-46)--.

At page 3, line 19, after "Figure 1 A" insert --(SEQ ID NO:2)--.

At page 10, line 9, please delete "Hartzoglou" and insert --Hatzoglou--.

At page 11, at line 18, please delete "1098" and insert --1089--. At line 19, please
delete "Casper" and insert --Kasper--. At line 24, please delete "Tanura" and insert --Tamura--.

At page 19, line 12, please delete "11" and insert --111--. At line 20, please delete
"Willart" and insert --Willard--. At line 21, please delete "1995" and insert --1994--.

At page 27, line 6, please delete "Thimmappaaya" and insert --Thimmappaya--.

After page 31, please insert the enclosed paper copy of the sequence listing, which 
consists of pages 32-62. Please renumber the following pages accordingly. 



In the Claims:

Please cancel claims 1-15. 

Please amend claims 16, 17, 22-26 and 32-33 as follows:

16. (Once amended) A method for treatment of a hyperproliferative disorder in a
patient, the method comprising administering to a patient a therapeutically effective dose of a
fusion polypeptide that comprises [comprising a fusion of] a DNA binding domain of a
transcription factor and a functional growth suppression domain of a [, the transcription factor
comprising a DNA binding domain, and a] retinoblastoma (RB) polypeptide [, the RB
polypeptide comprising a growth suppression domain].

17. (Once amended) The method of claim 16, wherein the fusion polypeptide
[protein] is encoded by a nucleic acid delivered to the patient.

18. (Reiterated) The method of claim 16, wherein the transcription factor is E2F.

19. (Reiterated) The method of claim 18, wherein the cyclin A binding domain
of the E2F is deleted or nonfunctional.

20. (Reiterated) The method of claim 16, wherein the RB is RB56.

21. (Reiterated) The method of claim 16, wherein the RB is wild type RB56.

22. (Once amended) The method of claim 16, wherein the functional growth
suppression domain of the RB polypeptide comprises from about amino acid residue 379 to
about amino acid residue 928 (SEQ ID NO:4).

23. (Once amended) The method of claim 16, wherein the functional growth
suppression domain of the RB polypeptide comprises at least one substitution of amino acid
residues selected from the group consisting of 2, 608, 612, 788, 807, and 811.

24. (Once amended) The method of claim 18, wherein the E2F polypeptide
comprises about amino acid residues 95 to about 286 (SEQ ID NO:1).

25. (Once amended) The method of claim 18, wherein the E2F polypeptide
comprises about amino acid residues 95 to about 194 (SEQ ID NO:1).

26. (Once amended) The method of claim 16, wherein the fusion polypeptide
comprises E2F amino acid residues from about 95 to about 194 (SEQ ID NO:1) operatively
linked to RB amino acid residues from about 379 to about 928 (SEQ ID NO:4).

27. (Reiterated) The method of claim 18, wherein the E2F-RB fusion
polypeptide is expressed under the control of a tissue-specific promoter. 

28. (Reiterated) The method of claim 27, wherein the tissue specific promoter is 
a smooth muscle actin promoter. 

29. (Reiterated) The method of claim 16, wherein the hyperproliferative
disorder is cancer. 

30. (Reiterated) The method of claim 29, wherein the cancer is bladder cancer.

31. (Reiterated) The method of claim 29, wherein the hyperproliferative
disorder is restenosis. 

32. (Once amended) The method of claim 31, wherein the [E2F-RB] fusion
polypeptide is administered after angioplasty. 

33. (Once amended) The method of claim 32, wherein the [E2F-RB] fusion 
polypeptide is administered as a coating on an angioplasty device. 

34. (Reiterated) The method of claim 17, wherein the nucleic acid is 
administered after angioplasty. 

35. (Reiterated) The method of claim 17, wherein the nucleic acid is 
administered as a coating on an angioplasty device. 

36. (Reiterated) The method of claim 17, wherein the nucleic acid is inserted in 
an adenovirus vector. 



Please add the following new claim 37. 

37. The method of claim 16, wherein the fusion polypeptide lacks a functional 
cyclin A-kinase binding domain of the transcription factor.

REMARKS

Status of the Application 

Claims 16-37 are pending with entry of this amendment, with claims 16-36
previously in the application and entry of new claim 37 respectfully requested.

The Amendments

The amendments to page 3 of the specification provide the sequence ID numbers 
for the various amino acid and nucleotide sequences. The remaining amendments to the 
specification correct various errors in the citations of the listed references. 

The Sequence Listing 

The paper copy of the Sequence Listing in this application is identical to the 
computer readable copy of the Sequence Listing filed in Application No. 08/801,092, filed 
February 14, 1997. In accordance with 37 CFR § 1.821(e), please use the only computer readable 
form filed in that application (filed May 12, 1997) as the computer readable form for the instant 
application. It is understood that the Patent and Trademark Office will make the necessary 
change in application number and filing date for the instant application. A paper copy of the 
Sequence Listing is enclosed herewith for incorporation into the specification. Applicants attest 
that the information contained in the Sequence Listing introduces no new matter and that the 
computer-readable form submitted herewith is the same as the paper copy of the Sequence 
Listing. 

CONCLUSION 

In view of the foregoing, Applicants believe that all claims now pending in this 
application are in condition for allowance. The issuance of a formal Notice of Allowance at an 
early date is respectfully requested.

If a telephone conference would expedite prosecution of this application, the
Examiner is invited to telephone the undersigned attorney at (415) 576-0200. 

Respectfully submitted, 

Timothy L. Smith, Ph.D. 
Reg. No. 35,367 

TOWNSEND and TOWNSEND and CREW, LLP 

Two Embarcadero Center, 8th Floor 

San Francisco, California 94111-3834 

(415) 576-0200 

(415) 576-0300 (facsimile) 

TISSUE SPECIFIC EXPRESSION OF RETINOBLASTOMA PROTEIN

BACKGROUND OF THE INVENTION 

Both the retinoblastoma gene (RB) and transcription factor E2F play a
critical role in cell growth control (for a review, see Adams, P. &
Kaelin, W. Seminars in Cancer Biology 6:99-108 (1995)). The RB locus is
frequently inactivated in a variety of human tumor cells. Reintroduction
of a wild-type RB gene (e.g., Bookstein et al. Science 247:712-715 (1990))
or RB protein (pRB) (e.g., Antelman et al. Oncogene 10:697-704 (1995))
into RBneg/RBmut cells can suppress growth in culture and tumorigenicity
in vivo.

While E2F serves to activate transcription of S-phase genes, its activity
is kept in check by RB. RB arrests cells by blocking exit from G1 into
S-phase (for example, Dowdy et al. Cell 73:499-511 (1993)), but the
precise pathway of the arrest remains unclear.

Although E2F forms complexes with RB, complex formation is more efficient
if an E2F-related protein, DP-1, is present. E2F-1 and DP-1 form stable
heterodimers which bind to DNA (for example, Qin et al. Genes and Dev.
6:953-964 (1992)). DP-1-E2F complexes serve to cooperatively activate
transcription of E2F-dependent genes. Such transcription can be repressed
by pRB in the same manner as E2F-1 or DP-1 activated transcription.

Transcriptional repression of genes by RB in some instances can be
achieved by tethering pRB to a promoter. For example, GAL4-pRB fusions
bind to GAL4 DNA binding domains and repress transcription from p53, Sp-1
or AP-1 elements (Adnane, et al. J. Biol. Chem. 270:8837-8843 (1995);
Weintraub, et al. Nature 358:259-261 (1995)). Sellers, et al. (Proc.
Natl. Acad. Sci. 92:11544-11548 (1995)) disclosed fusions of amino acid
residues 1-368 of E2F with amino acids 379-792 or 379-928 of RB.

Chang, et al. (Science 267:518-521 (1995)) disclosed the use of a
replication-defective adenovirus-RB construct in the reduction of
neointima formation in two animal models of restenosis, a
hyperproliferative disorder.

SUMMARY OF THE INVENTION

The instant invention provides the surprising result that a fusion of an
E2F polypeptide with an RB polypeptide is more efficient in repressing
transcription of the E2F promoter than RB alone, and that such fusions
can cause cell cycle arrest in a variety of cell types. Such fusions can
thus address the urgent need for therapy of hyperproliferative disorders,
including cancer.

One aspect of the invention is a polypeptide comprising a fusion of a
transcription factor, the transcription factor comprising a DNA binding
domain, and a retinoblastoma (RB) polypeptide, the RB polypeptide
comprising a growth suppression domain. Another aspect of the invention
is DNA encoding such a fusion polypeptide. The DNA can be inserted in an
adenovirus vector.

In some embodiments of the invention, the transcription factor is E2F.
The cyclin A binding domain of the E2F can be deleted or nonfunctional.
The E2F can comprise amino acid residues about 95 to about 194 or about
95 to about 286 in some embodiments.

The retinoblastoma polypeptide can be wild-type RB, RB56, or a variant or
fragment thereof. In some embodiments, the retinoblastoma polypeptide
comprises amino acid residues of about 379 to about 928. Preferred amino
acid substitutions of the RB polypeptide include residues 2, 608, 788,
807, and 811.

Another aspect of the invention is an expression vector comprising DNA
encoding a polypeptide, the polypeptide comprising a fusion of a
transcription factor, the transcription factor comprising a DNA binding
domain, and a retinoblastoma (RB) polypeptide, the RB polypeptide
comprising a growth suppression domain. In some embodiments a
tissue-specific promoter is operatively linked to DNA encoding the fusion
polypeptide. The tissue-specific promoter can be a smooth muscle alpha
actin promoter.

Another aspect of the invention is a method for treatment of
hyperproliferative disorders comprising administering to a patient a
therapeutically effective dose of an E2F-RB fusion polypeptide. The
hyperproliferative disorder can be cancer. In some embodiments the
hyperproliferative disorder is restenosis. The fusion polypeptide and
nucleic acid encoding the fusion polypeptide can be used to coat devices
used for angioplasty.

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1A depicts the predicted amino acid sequence of E2F.

Figure 1B depicts the nucleotide sequence of transcription factor E2F.

Figure 2A depicts the nucleotide sequence of pRB as disclosed by Lee, et
al. (Nature 329:642-645 (1987)).

Figure 2B depicts the predicted amino acid sequence of pRB.

Figure 3 is a diagrammatic representation of pCTM.

Figure 4 depicts the nucleotide sequence of plasmid pCTM.

Figure 5 is a diagrammatic representation of pCTMI.

Figure 6 depicts the nucleotide sequence of pCTMI.

Figure 7 is a diagrammatic representation of plasmid pCTMIE.

Figure 8 depicts the nucleotide sequence of pCTMIE.

Figure 9 is a diagram depicting E2F-RB fusion constructs used in the
examples. All E2F constructs commenced at amino acid 95 and lacked part
of the cyclin A binding domain. E2F-437 contained the DNA binding domain
(black), heterodimerization domain (white), and the transactivation
domain (stippled). E2F-194 contained solely the DNA binding domain.
E2F-286 contained the DNA binding domain and the DP-1 heterodimerization
domain. To generate E2F-194-RB56-5S and E2F-286-RB56-5S, the E2F
constructs were fused in-frame to codon 379 of RB. C706F is an
inactivating point mutation.

Figure 10 is a diagram depicting transcriptional repression by E2F-RB
fusion constructs.

Figure 11 (A-D) depicts expression of E2F-RB fusion proteins in mammalian
cell lines. Extracts were prepared from cells used in E2-CAT reporter
assays or in FACS assays and analyzed with an anti-RB monoclonal
antibody. In panel A, the results are shown from C33A cells transfected
with (3) RB56-H209, (4) RB56 wild-type, (5) RB56-5s, (6) E2F286-5s, (7)
E2F194-5s, (8) E2F194, (9) E2F286, (10) E2F437. Lane (1) is an RB56
protein standard. Lane (2) is a mock transfection. In panel B, results
are shown for transfection of Saos-2 cells with (1) RB56, (2,3)
E2F194-5s, and (4,5) E2F286-5S. In panel C, results are shown for
transfection of 5637 cells with (2,3) RB56 wild-type, (4,5) RB56-5S;
(6,7) E2F194-5S; (7,8) E2F286-5S. Lane (1) is an RB56 protein standard.
In panel D, results are shown for NIH-3T3 cells transfected with (3)
RB56, (4) E2F286-5s, (5) E2F194-5s. Lane (1) is an RB56 standard; lane
(2) is an RB110 standard.

Figure 12 depicts histogram analyses of flow 
cytometry of RB-expressing NIH-3T3 cells. 

Figure 13, panel A, depicts a comparison of the effects of a CMV-driven
recombinant adenovirus (ACN56) with two isolates of a human smooth muscle
alpha actin-driven E2F-p56 fusion construct consisting of amino acids 95
through 286 of E2F linked directly and in-frame to p56 (amino acids
379-928 of RB cDNA), vs. a control virus (ACN) in a ³H-thymidine uptake
assay in the rat smooth muscle cell line A7R5. Panel (B) depicts the
effects of the same constructs in the rat smooth muscle cell line A10.

Figure 14 depicts a comparison of the effects of the viruses described in
Fig. 13 in non-muscle cells. Panel (A) depicts results in the breast
carcinoma cell line MDA MB468. Panel (B) depicts results in the non-small
cell lung carcinoma cell line H358.

Figure 15, top panel, depicts the relative infectivity by adenovirus of
different cell lines as judged by the level of β-galactosidase (β-gal)
staining following infection with equal amounts of a recombinant
adenovirus expressing β-gal driven by a CMV promoter. H358 is a non-small
cell lung carcinoma cell line; MB468 is a breast carcinoma cell line;
A7R5 and A10 are smooth muscle cell lines. The lower portion of the
figure depicts the relative levels of p56 protein expressed in the same
cells when infected with the recombinant adenovirus ACN56, in which the
p56 cDNA is driven by the non-tissue specific CMV promoter.

Figure 16 depicts relative protein levels in cells infected with the
smooth muscle alpha actin promoter-driven E2F-p56 fusion construct
(ASN286-56). UN denotes uninfected; 50, 100, 250, and 500 refer to
multiplicities of infection (MOI).

Figure 17 is a bar graph depicting the ratio of intima to media area (as
a measurement of the inhibition of neointima formation) from
cross-sections (n=9) of rat carotid arteries which were injured and
treated with recombinant adenoviruses expressing either β-gal, RB (ACNRB)
or p56 (ACN56), all under the control of the CMV promoter.

Figure 18 is a series of three photographs depicting restenosis in a rat
angioplasty model. The panel on the left depicts data from a normal
animal; the central panel depicts data from an animal injured and then
treated with a β-gal expressing recombinant virus; the panel on the right
depicts data from an animal injured and then treated with a recombinant
adenovirus expressing p56 (ACN56).

Figure 19 depicts tissue-specificity of the smooth muscle alpha actin
promoter, as demonstrated by its selective ability to express the β-gal
transgene in muscle cells but not non-muscle cells. The panels on the
left compare β-gal expression in the breast cell carcinoma line MB468
infected with either an MOI=1 with a CMV-driven β-gal (ACNBGAL) vs an
MOI=100 with the smooth muscle promoter construct (ASNBGAL). The panels
on the right show β-gal expression of the rat smooth muscle cell line
A7R5 infected with either an MOI=1 of ACNBGAL or an MOI=50 of ASNBGAL.
Expression from ASNBGAL is seen in the muscle cell line, but is absent in
the non-muscle cell line, despite the higher degree of infectivity of the
cells.

Figure 20 depicts the ability of recombinant adenovirus expressing RB to
transduce rat carotid arteries. Recombinant adenovirus-treated arteries
(1 x 10^9 pfu) were harvested two days following balloon injury and
infection. Cross sections were fixed and an RB specific antibody was used
to detect the presence of RB protein in the tissue. The control virus
used was ACN. RB protein staining was evident in the ACNRB treated
sample, especially at higher magnifications.

Figure 21 depicts a comparison of the effects of a CMV-driven p56
recombinant adenovirus (ACN56E4) vs a human smooth muscle alpha-actin
promoter-driven E2F-p56 fusion construct (ASN286-56) vs control
adenoviral constructs containing either the CMV or smooth muscle
alpha-actin promoters without a downstream transgene (ACNE3 or ASBE3-2
isolates shown, respectively). Assays were ³H-thymidine uptake either in
a smooth muscle cell line (A7R5) or a non-muscle cell line (MDA-MB468,
breast carcinoma). Results demonstrated muscle tissue specificity using
the smooth muscle alpha-actin promoter and specific inhibition by both
the p56 and E2F-p56 transgenes relative to their respective controls.

DESCRIPTION OF THE PREFERRED EMBODIMENT 

The instant invention provides RB fusion constructs including fusion
polypeptides and vectors encoding them, and methods for the use of such
constructs in the treatment of hyperproliferative diseases. In some
preferred embodiments of the invention, an RB polypeptide is fused to an
E2F polypeptide. Any E2F species can be used, typically E2F-1, -2, -3,
-4, or -5 (see, e.g., Wu et al. Mol. Cell. Biol. 15:2536-2546 (1995);
Ivey-Hoyle et al. Mol. Cell. Biol. 13:7802 (1993); Vairo et al. Genes and
Dev. 9:869 (1995); Beijersbergen et al. Genes and Dev. 8:2680 (1994);
Ginsberg et al. Genes and Dev. 8:2665 (1994); Buck et al. Oncogene 11:31
(1995)), more typically E2F-1. Typically, the E2F polypeptide comprises
at least the DNA binding domain of E2F, and may optionally include the
cyclin A binding domain, the heterodimerization domain, and/or the
transactivation domain. Preferably, the cyclin A binding domain is not
functional. The nucleotide and amino acid sequences of E2F referred to
herein are those of Genbank HUME2F, shown in Figure 1A and 1B. Nucleic
acid, preferably DNA, encoding such an E2F polypeptide is fused in
reading frame to an RB polypeptide. The RB polypeptide can be any RB
polypeptide, including conservative amino acid variants, allelic
variants, amino acid substitution, deletion, or insertion mutants, or
fragments thereof. Preferably, the growth suppression domain, i.e., amino
acid residues 379-928, of the RB polypeptide is functional (Hiebert, et
al. MCB 13:3384-3391 (1993); Qin, et al. Genes and Dev. 6:953-964
(1992)). In some embodiments, wild-type pRB110 is used. More preferably,
a truncated version of RB, RB56, is used. RB56 comprises amino acid
residues 379-928 of pRB110 (Hiebert, et al. MCB 13:3384-3391 (1993); Qin,
et al. Genes and Dev. 6:953-964 (1992)). In some embodiments, amino acid
variants of RB at positions 2, 608, 612, 788, 807, or 811 are used singly
or in combination. The variant RB56-5S comprises wild-type RB56 having
alanine substitutions at 608, 612, 788, 807, and 811. Numbering of RB
amino acids and nucleotides is according to the RB sequence disclosed by
Lee, et al. (Nature 329:642-645 (1987)), hereby incorporated by reference
in its entirety for all purposes (Figure 2).
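
The residue ranges given above are 1-based and inclusive, so a fragment such as RB56 (residues 379-928) is a simple slice of the full-length sequence, and the RB56-5S substitutions can be applied in full-length numbering by subtracting the 378 residues removed from the N-terminus. The following is a minimal sketch of that arithmetic only. The helper names and the placeholder sequences are assumptions made for illustration; they are not the sequences of Figures 1 and 2.

```python
# Illustrative sketch only: placeholder sequences and hypothetical helper names.

def fragment(seq: str, start: int, end: int) -> str:
    """Return residues start..end (1-based, inclusive) of seq."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("residue range outside sequence")
    return seq[start - 1:end]


def substitute(seq: str, positions, residue: str = "A", offset: int = 0) -> str:
    """Replace residues at the given full-length positions with `residue`.

    `offset` is the number of residues removed from the N-terminus, so that
    positions can still be given in full-length numbering.
    """
    chars = list(seq)
    for pos in positions:
        idx = pos - 1 - offset
        if not (0 <= idx < len(chars)):
            raise ValueError(f"position {pos} not in fragment")
        chars[idx] = residue
    return "".join(chars)


# Placeholder full-length sequences (NOT the real E2F or RB sequences).
rb_full = "G" * 928
e2f_full = "S" * 437

rb56 = fragment(rb_full, 379, 928)      # growth suppression domain
rb56_5s = substitute(rb56, [608, 612, 788, 807, 811], offset=378)
e2f_194 = fragment(e2f_full, 95, 194)   # E2F residues 95-194
fusion = e2f_194 + rb56_5s              # E2F portion N-terminal to RB portion

print(len(rb56), len(e2f_194), len(fusion))
```

With real sequences, the same slicing would be applied to the sequences of Figures 1A and 2B in place of the placeholders.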

Nucleic acids encoding the polypeptides of the invention can be DNA or
RNA. The phrase "nucleic acid sequence encoding" refers to a nucleic acid
which directs the expression of a specific protein or peptide. The
nucleic acid sequences include both the DNA strand sequence that is
transcribed into RNA and the RNA sequence that is translated into
protein. The nucleic acid sequences include both the full length nucleic
acid sequences and non-full length sequences derived from the full length
protein. It is further understood that the sequence includes the
degenerate codons of the native sequence or sequences which may be
introduced to provide codon preference in a specific host cell.
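
Codon degeneracy means that different nucleic acid sequences can encode the same polypeptide. The following minimal sketch illustrates this with a partial table drawn from the standard genetic code. The host names, the codon preferences, and the example peptide are hypothetical and chosen only for illustration.

```python
# Illustrative sketch: partial codon table (standard genetic code) and
# hypothetical per-host codon preferences.

CODON_TABLE = {
    "ATG": "M",
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
    "AAA": "K", "AAG": "K",
    "GAA": "E", "GAG": "E",
    "TAA": "*", "TAG": "*", "TGA": "*",
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical codon preferences for two made-up host cells.
PREFERRED = {
    "host_1": {"M": "ATG", "A": "GCC", "K": "AAG", "E": "GAG"},
    "host_2": {"M": "ATG", "A": "GCT", "K": "AAA", "E": "GAA"},
}


def back_translate(protein: str, host: str) -> str:
    """Encode a peptide using one host's preferred codons, plus a stop codon."""
    return "".join(PREFERRED[host][aa] for aa in protein) + "TAA"


peptide = "MAKE"
dna_1 = back_translate(peptide, "host_1")
dna_2 = back_translate(peptide, "host_2")
print(dna_1, dna_2, translate(dna_1) == translate(dna_2))
```

The two DNA sequences differ at the degenerate positions but translate to the same peptide.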

The term "vector" as used herein refers to viral expression systems and
autonomous self-replicating circular DNA (plasmids), and includes both
expression and nonexpression plasmids. Where a recombinant microorganism
or cell culture is described as hosting an "expression vector," this
includes both extrachromosomal circular DNA and DNA that has been
incorporated into the host chromosome(s). Where a vector is being
maintained by a host cell, the vector may either be stably replicated by
the cells during mitosis as an autonomous structure, or be incorporated
within the host's genome. A vector contains multiple genetic elements
positionally and sequentially oriented, i.e., operatively linked with
other necessary elements such that nucleic acid in the vector encoding
the constructs of the invention can be transcribed, and when necessary,
translated in transfected cells.

The term "gene" as used herein is intended to refer 
to a nucleic acid sequence which encodes a polypeptide. This 
definition includes various sequence polymorphisms, mutations, 
and/or sequence variants wherein such alterations do not 
affect the function of the gene product. The term "gene" is 
intended to include not only coding sequences but also 
regulatory regions such as promoters, enhancers, and 
termination regions. The term further includes all introns
and other DNA sequences spliced from the mRNA transcript, 
along with variants resulting from alternative splice sites. 

The term "plasmid" refers to an autonomous circular DNA molecule capable
of replication in a cell, and includes both the expression and
nonexpression types. Where a recombinant microorganism or cell culture is
described as hosting an "expression plasmid", this includes both
extrachromosomal circular DNA molecules and DNA that has been
incorporated into the host chromosome(s). Where a plasmid is being
maintained by a host cell, the plasmid is either being stably replicated
by the cells during mitosis as an autonomous structure or is incorporated
within the host's genome.

The phrase "recombinant protein" or "recombinantly produced protein"
refers to a peptide or protein produced using non-native cells that do
not have an endogenous copy of DNA able to express the protein. The cells
produce the protein because they have been genetically altered by the
introduction of the appropriate nucleic acid sequence. The recombinant
protein will not be found in association with proteins and other
subcellular components normally associated with the cells producing the
protein. The terms "protein" and "polypeptide" are used interchangeably
herein.

In general, a construct of the invention is provided in an expression
vector comprising the following elements linked sequentially at
appropriate distances for functional expression: a tissue-specific
promoter, an initiation site for transcription, a 3' untranslated region,
a 5' mRNA leader sequence, a nucleic acid sequence encoding a polypeptide
of the invention, and a polyadenylation signal. Such linkage is termed
"operatively linked." Enhancer sequences and other sequences aiding
expression and/or secretion can also be included in the expression
vector. Additional genes, such as those encoding drug resistance, can be
included to allow selection or screening for the presence of the
recombinant vector. Such additional genes can include, for example, genes
encoding neomycin resistance, multi-drug resistance, thymidine kinase,
beta-galactosidase, dihydrofolate reductase (DHFR), and chloramphenicol
acetyl transferase.
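
As a rough illustration of what "operatively linked" implies for element order, the sketch below models a cassette as an ordered list and checks only that a promoter precedes the coding sequence, which in turn precedes the polyadenylation signal. This is a deliberately simplified, hypothetical model; the class, the function, and the element labels are assumptions for illustration and do not describe any particular vector of the invention.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    kind: str   # e.g. "promoter", "leader", "coding", "polyA", "marker"
    name: str


def is_operatively_linked(cassette) -> bool:
    """Simplified check: promoter before coding sequence before polyA signal."""
    kinds = [element.kind for element in cassette]
    try:
        promoter = kinds.index("promoter")
        coding = kinds.index("coding")
        poly_a = kinds.index("polyA")
    except ValueError:
        return False  # a required element is missing
    return promoter < coding < poly_a


# Hypothetical cassette, listed 5' to 3'.
cassette = [
    Element("promoter", "smooth muscle alpha-actin promoter"),
    Element("leader", "5' mRNA leader sequence"),
    Element("coding", "E2F-RB fusion coding sequence"),
    Element("polyA", "polyadenylation signal"),
    Element("marker", "neomycin resistance gene"),
]

print(is_operatively_linked(cassette))
```

A real construct also depends on spacing, orientation, and reading frame, none of which this ordering check captures.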

In the instant invention, tissue-specific expression of the RB constructs
of the invention is preferably accomplished by the use of a promoter
preferentially used by a tissue of interest. Examples of tissue-specific
promoters include the promoter for creatine kinase, which has been used
to direct dystrophin cDNA expression in muscle and cardiac tissue (Cox,
et al. Nature 364:725-729 (1993)), and immunoglobulin heavy or light
chain promoters for the expression of suicide genes in B cells (Maxwell,
et al. Cancer Res. 51:4299-4304 (1991)). An endothelial cell-specific
regulatory region has also been characterized (Jahroudi, et al. Mol.
Cell. Biol. 14:999-1008 (1994)). Amphotropic retroviral vectors have been
constructed carrying a herpes simplex virus thymidine kinase gene under
the control of either the albumin or alpha-fetoprotein promoters (Huber,
et al. Proc. Natl. Acad. Sci. U.S.A. 88:8039-8043 (1991)) to target cells
of liver lineage and hepatoma cells, respectively. Such tissue specific
promoters can be used in retroviral vectors (Hartzoglou, et al. J. Biol.
Chem. 265:17285-17293 (1990)) and adenovirus vectors (Friedman, et al.
Mol. Cell. Biol. 6:3791-3797 (1986); Wills et al. Cancer Gene Therapy
3:191-197 (1995)) and still retain their tissue specificity.

In the instant invention, a preferred promoter for tissue-specific
expression of exogenous genes is the human smooth muscle alpha-actin
promoter. Reddy, et al. (J. Cell Biology 265:1683-1687 (1990)) disclosed
the isolation and nucleotide sequence of this promoter, while Nakano, et
al. (Gene 99:285-289 (1991)) disclosed transcriptional regulatory
elements in the 5' upstream and the first intron regions of the human
smooth muscle (aortic type) alpha-actin gene.

Petropoulos, et al. (J. Virol. 66:3391-3397 (1992)) disclosed a
comparison of expression of bacterial chloramphenicol transferase (CAT)
operatively linked to either the chicken skeletal muscle alpha actin
promoter or the cytoplasmic beta-actin promoter. These constructs were
provided in a retroviral vector and used to infect chicken eggs.

Exemplary tissue-specific expression elements for the liver include but
are not limited to HMG-CoA reductase promoter (Luskey, Mol. Cell. Biol.
7(5):1881-1893 (1987)); sterol regulatory element 1 (SRE-1; Smith et al.
J. Biol. Chem. 265(4):2306-2310 (1990)); phosphoenol pyruvate carboxy
kinase (PEPCK) promoter (Eisenberger et al. Mol. Cell. Biol.
12(3):1396-1403 (1992)); human C-reactive protein (CRP) promoter (Li et
al. J. Biol. Chem. 265(7):4136-4142 (1990)); human glucokinase promoter
(Tanizawa et al. Mol. Endocrinology 5(7):1070-81 (1992)); cholesterol
7-alpha hydroxylase (CYP-7) promoter (Lee et al. J. Biol. Chem.
269(20):14681-9 (1994)); beta-galactosidase alpha-2,6 sialyltransferase
promoter (Svensson et al. J. Biol. Chem. 265(34):20863-8 (1990));
insulin-like growth factor binding protein (IGFBP-1) promoter (Babajko et
al. Biochem. Biophys. Res. Comm. 196(1):480-6 (1993)); aldolase B
promoter (Bingle et al. Biochem. J. 294(Pt2):473-9 (1993)); human
transferrin promoter (Mendelzon et al. Nucl. Acids Res. 18(19):5717-21
(1990)); collagen type 1 promoter (Houglum et al. J. Clin. Invest.
94(2):808-14 (1994)).

Exemplary tissue-specific expression elements for the prostate include
but are not limited to the prostatic acid phosphatase (PAP) promoter
(Banas et al. Biochim. Biophys. Acta. 1217(2):188-94 (1994)); prostatic
secretory protein of 94 (PSP 94) promoter (Nolet et al. Biochim. Biophys.
Acta 1098(2):247-9 (1991)); prostate specific antigen complex promoter
(Casper et al. J. Steroid Biochem. Mol. Biol. 47(1-6):127-35 (1993));
human glandular kallikrein gene promoter (hgt-1) (Lilja et al. World J.
Urology 11(4):188-91 (1993)).

Exemplary tissue-specific expression elements for gastric tissue include
but are not limited to the human H+/K+-ATPase alpha subunit promoter
(Tanura et al. FEBS Letters 298(2-3):137-41 (1992)).

Exemplary tissue-specific expression elements for the pancreas include
but are not limited to pancreatitis associated protein promoter (PAP)
(Dusetti et al. J. Biol. Chem. 268(19):14470-5 (1993)); elastase 1
transcriptional enhancer (Kruse et al. Genes and Development 7(5):774-86
(1993)); pancreas specific amylase and elastase enhancer promoter (Wu et
al. Mol. Cell. Biol. 11(9):4423-30 (1991); Keller et al. Genes & Dev.
4(8):1316-21 (1990)); pancreatic cholesterol esterase gene promoter
(Fontaine et al. Biochemistry 30(28):7008-14 (1991)).

Exemplary tissue-specific expression elements for the endometrium include
but are not limited to the uteroglobin promoter (Kelftenbein et al.
Annal. NY Acad. Sci. 622:69-79 (1991)).

Exemplary tissue-specific expression elements for adrenal cells include
but are not limited to cholesterol side-chain cleavage (SCC) promoter
(Rice et al. J. Biol. Chem. 265:11713-20 (1990)).

Exemplary tissue-specific expression elements for the general nervous
system include but are not limited to gamma-gamma enolase
(neuron-specific enolase, NSE) promoter (Forss-Petter et al. Neuron
5(2):187-97 (1990)).

Exemplary tissue-specific expression elements for the brain include but
are not limited to the neurofilament heavy chain (NF-H) promoter
(Schwartz et al. J. Biol. Chem. 269(18):13444-50 (1994)).

Exemplary tissue-specific expression elements for lymphocytes include but
are not limited to the human CGL-1/granzyme B promoter (Hanson et al. J.
Biol. Chem. 266(36):24433-8 (1991)); the terminal deoxy transferase
(TdT), lambda 5, VpreB, and lck (lymphocyte specific tyrosine protein
kinase p56lck) promoter (Lo et al. Mol. Cell. Biol. 11(10):5229-43
(1991)); the human CD2 promoter and its 3' transcriptional enhancer (Lake
et al. EMBO J. 9(10):3129-36 (1990)), and the human NK and T cell
specific activation (NKG5) promoter (Houchins et al. Immunogenetics
37(2):102-7 (1993)).

Exemplary tissue-specific expression elements for the colon include but
are not limited to pp60c-src tyrosine kinase promoter (Talamonti et al.
J. Clin. Invest. 91(1):53-60 (1993)); organ-specific neoantigens (OSNs),
mw 40kDa (p40) promoter (Ilantzis et al. Microbiol. Immunol. 37(2):119-28
(1993)); colon specific antigen-P promoter (Sharkey et al. Cancer 73(3
supp.) 864-77 (1994)).

Exemplary tissue-specific expression elements for
breast cells include but are not limited to the human
alpha-lactalbumin promoter (Thean et al. British J. Cancer
61(5):773-5 (1990)).

Other elements aiding specificity of expression in a
tissue of interest can include secretion leader sequences,
enhancers, nuclear localization signals, endosomolytic
peptides, etc. Preferably, these elements are derived from
the tissue of interest to aid specificity.

Techniques for nucleic acid manipulation of the
nucleic acid sequences of the invention such as subcloning
nucleic acid sequences encoding polypeptides into expression
vectors, labelling probes, DNA hybridization, and the like are
described generally in Sambrook et al., Molecular Cloning - A
Laboratory Manual (2nd Ed.), Vol. 1-3, Cold Spring Harbor
Laboratory, Cold Spring Harbor, New York, (1989), which is
incorporated herein by reference. This manual is hereinafter
referred to as "Sambrook et al."

Once DNA encoding a sequence of interest is isolated
and cloned, one can express the encoded proteins in a variety
of recombinantly engineered cells. It is expected that those
of skill in the art are knowledgeable in the numerous
expression systems available for expression of the encoding
DNA. No attempt to describe in detail the various methods
known for the expression of proteins in prokaryotes or
eukaryotes is made here.

In brief summary, the expression of natural or
synthetic nucleic acids encoding a sequence of interest will
typically be achieved by operably linking the DNA or cDNA to a
promoter (which is either constitutive or inducible), followed
by incorporation into an expression vector. The vectors can
be suitable for replication and integration in either
prokaryotes or eukaryotes. Typical expression vectors contain
transcription and translation terminators, initiation
sequences, and promoters useful for regulation of the
expression of the polynucleotide sequence of interest. To
obtain high level expression of a cloned gene, it is desirable
to construct expression plasmids which contain, at the minimum,
a strong promoter to direct transcription, a ribosome binding
site for translational initiation, and a
transcription/translation terminator. The expression vectors
may also comprise generic expression cassettes containing at
least one independent terminator sequence, sequences
permitting replication of the plasmid in both eukaryotes and
prokaryotes, i.e., shuttle vectors, and selection markers for
both prokaryotic and eukaryotic systems. See Sambrook et al.

The E2F-RB fusion constructs of the invention can be
introduced into the tissue of interest in vivo or ex vivo by a
variety of methods. In some embodiments of the invention, the
nucleic acid, preferably DNA, is introduced to cells by such
methods as microinjection, calcium phosphate precipitation,
liposome fusion, or biolistics. In further embodiments, the
DNA is taken up directly by the tissue of interest. In other
embodiments, the constructs are packaged into a viral vector
system to facilitate introduction into cells.

Viral vector systems useful in the practice of the
instant invention include adenovirus, herpesvirus,
adeno-associated virus, minute virus of mice (MVM), HIV,
sindbis virus, and retroviruses such as Rous sarcoma virus and
MoMLV. Typically, the constructs of the instant invention are
inserted into such vectors to allow packaging of the E2F-RB
expression construct, typically with accompanying viral DNA,
infection of a sensitive host cell, and expression of the
E2F-RB gene. A particularly advantageous vector is the
adenovirus vector disclosed in Wills, et al. Human Gene
Therapy 5:1079-1088 (1994).

In still other embodiments of the invention, the
recombinant DNA constructs of the invention are conjugated to
a cell receptor ligand for facilitated uptake (e.g.,
invagination of coated pits and internalization of the
endosome) through a DNA linking moiety (Wu, et al. J. Biol.
Chem. 263:14621-14624 (1988); WO 92/06180). For example, the
DNA constructs of the invention can be linked through a
polylysine moiety to asialoorosomucoid, which is a ligand for
the asialoglycoprotein receptor of hepatocytes.

Similarly, viral envelopes used for packaging the
constructs of the invention can be modified by the addition of
receptor ligands or antibodies specific for a receptor to
permit receptor-mediated endocytosis into specific cells
(e.g., WO 93/20221, WO 93/14188; WO 94/06923). In some
embodiments of the invention, the DNA constructs of the
invention are linked to viral proteins, such as adenovirus
particles, to facilitate endocytosis (Curiel, et al. Proc.
Natl. Acad. Sci. U.S.A. 88:8850-8854 (1991)). In other
embodiments, molecular conjugates of the instant invention can
include microtubule inhibitors (WO 94/06922); synthetic
peptides mimicking influenza virus hemagglutinin (Plank, et
al. J. Biol. Chem. 269:12918-12924 (1994)); and nuclear
localization signals such as SV40 T antigen (WO 93/19768).

In some embodiments of the invention, the RB
polypeptides of the invention are administered directly to a
patient in need of treatment. A "therapeutically effective"
dose is a dose of polypeptide sufficient to prevent or reduce
severity of a hyperproliferative disorder. As used herein,
the term "hyperproliferative cells" includes but is not
limited to cells having the capacity for autonomous growth,
i.e., existing and reproducing independently of normal
regulatory mechanisms. Hyperproliferative diseases may be
categorized as pathologic, i.e., deviating from normal cells,
characterizing or constituting disease, or may be categorized
as non-pathologic, i.e., deviation from normal but not
associated with a disease state. Pathologic
hyperproliferative cells are characteristic of the following
disease states: restenosis, diabetic retinopathy, thyroid
hyperplasia, Graves' disease, psoriasis, benign prostatic
hypertrophy, Li-Fraumeni syndrome including breast cancer,
sarcomas and other neoplasms, bladder cancer, colon cancer,
lung cancer, various leukemias and lymphomas. Examples of
non-pathological hyperproliferative cells are found, for
instance, in mammary ductal epithelial cells during
development of lactation and also in cells associated with
wound repair. Pathological hyperproliferative cells
characteristically exhibit loss of contact inhibition and a
decline in their ability to selectively adhere, which implies
a further breakdown in intercellular communication. These
changes include stimulation to divide and the ability to
secrete proteolytic enzymes.

The constructs of the invention are useful in the
therapy of various cancers and other conditions in which the
administration of RB is advantageous, including but not
limited to peripheral vascular diseases and diabetic
retinopathy. Although any tissue can be targeted for which
some tissue-specific expression element, such as a promoter,
can be identified, of particular interest is the
tissue-specific administration of an RB construct for
hyperproliferative disorders such as restenosis, for which the
smooth muscle actin promoter is preferable.

The compositions of the invention will be formulated
for administration by manners known in the art acceptable for
administration to a mammalian subject, preferably a human. In
some embodiments of the invention, the compositions of the
invention can be administered directly into a tissue by
injection or into a blood vessel supplying the tissue of
interest. In further embodiments of the invention the
compositions of the invention are administered
"locoregionally", i.e., intravesically, intralesionally,
and/or topically. In other embodiments of the invention, the
compositions of the invention are administered systemically by
injection, inhalation, suppository, transdermal delivery, etc.
In further embodiments of the invention, the compositions are
administered through catheters or other devices to allow
access to a remote tissue of interest, such as an internal
organ. The compositions of the invention can also be
administered in depot type devices, implants, or encapsulated
formulations to allow slow or sustained release of the
compositions.

The invention provides compositions for
administration which comprise a solution of the compositions
of the invention dissolved or suspended in an acceptable
carrier, preferably an aqueous carrier. A variety of aqueous
carriers may be used, e.g., water, buffered water, 0.8%
saline, 0.3% glycine, hyaluronic acid and the like. These
compositions may be sterilized by conventional, well known
sterilization techniques, or may be sterile filtered. The
resulting aqueous solutions may be packaged for use as is, or
lyophilized, the lyophilized preparation being combined with a
sterile solution prior to administration. The compositions
may contain pharmaceutically acceptable auxiliary substances
as required to approximate physiological conditions, such as
pH adjusting and buffering agents, tonicity adjusting agents,
wetting agents and the like, for example, sodium acetate,
sodium lactate, sodium chloride, potassium chloride, calcium
chloride, sorbitan monolaurate, triethanolamine oleate, etc.

The concentration of the compositions of the
invention in the pharmaceutical formulations can vary widely,
i.e., from less than about 0.1%, usually at or at least about
2% to as much as 20% to 50% or more by weight, and will be
selected primarily by fluid volumes, viscosities, etc., in
accordance with the particular mode of administration
selected.

The compositions of the invention may also be
administered via liposomes. Liposomes include emulsions,
foams, micelles, insoluble monolayers, liquid crystals,
phospholipid dispersions, lamellar layers and the like. In
these preparations the composition of the invention to be
delivered is incorporated as part of a liposome, alone or in
conjunction with a molecule which binds to a desired target,
such as an antibody, or with other therapeutic or immunogenic
compositions. Thus, liposomes either filled or decorated with
a desired composition of the invention can be delivered
systemically, or can be directed to a tissue of interest,
where the liposomes then deliver the selected
therapeutic/immunogenic peptide compositions.

Liposomes for use in the invention are formed from
standard vesicle-forming lipids, which generally include
neutral and negatively charged phospholipids and a sterol,
such as cholesterol. The selection of lipids is generally
guided by consideration of, e.g., liposome size, acid lability
and stability of the liposomes in the blood stream. A variety
of methods are available for preparing liposomes, as described
in, e.g., Szoka et al. Ann. Rev. Biophys. Bioeng. 9:467
(1980), U.S. Patent Nos. 4,235,871, 4,501,728, 4,837,028, and
5,019,369, incorporated herein by reference.

A liposome suspension containing a composition of
the invention may be administered intravenously, locally,
topically, etc. in a dose which varies according to, inter
alia, the manner of administration, the composition of the
invention being delivered, and the stage of the disease being
treated.

For solid compositions, conventional nontoxic solid
carriers may be used which include, for example,
pharmaceutical grades of mannitol, lactose, starch, magnesium
stearate, sodium saccharin, talcum, cellulose, glucose,
sucrose, magnesium carbonate, and the like. For oral
administration, a pharmaceutically acceptable nontoxic
composition is formed by incorporating any of the normally
employed excipients, such as those carriers previously listed,
and generally 10-95% of active ingredient, that is, one or
more compositions of the invention, and more preferably at a
concentration of 25%-75%.

For aerosol administration, the compositions of the
invention are preferably supplied in finely divided form along
with a surfactant and propellant. Typical percentages of
compositions of the invention are 0.01%-20% by weight,
preferably 1%-10%. The surfactant must, of course, be
nontoxic, and preferably soluble in the propellant.
Representative of such agents are the esters or partial esters
of fatty acids containing from 6 to 22 carbon atoms, such as
caproic, octanoic, lauric, palmitic, stearic, linoleic,
linolenic, olesteric and oleic acids with an aliphatic
polyhydric alcohol or its cyclic anhydride. Mixed esters,
such as mixed or natural glycerides may be employed. The
surfactant may constitute 0.1%-20% by weight of the
composition, preferably 0.25-5%. The balance of the
composition is ordinarily propellant. A carrier can also be
included, as desired, as with, e.g., lecithin for intranasal
delivery.

The constructs of the invention can additionally be
delivered in a depot-type system, an encapsulated form, or an
implant by techniques well-known in the art. Similarly, the
constructs can be delivered via a pump to a tissue of
interest.

In some embodiments of the invention, the
compositions of the invention are administered ex vivo to
cells or tissues explanted from a patient, then returned to
the patient. Examples of ex vivo administration of gene
therapy constructs include Arteaga et al. Cancer Research
56(5):1098-1103 (1996); Nolta et al. Proc. Natl. Acad. Sci. USA
93(6):2414-9 (1996); Koc et al. Seminars in Oncology
23(1):46-65 (1996); Raper et al. Annals of Surgery
223(2):116-26 (1996); Dalesandro et al. J. Thorac. Cardi. Surg.
111(2):416-22 (1996); and Makarov et al. Proc. Natl. Acad. Sci.
USA 93(1):402-6 (1996).

In some embodiments of the invention, the constructs
of the invention are administered to a cardiac artery after
balloon angioplasty to prevent or reduce the severity of
restenosis. The constructs of the invention can be used to
coat the device used for angioplasty (see, for example,
Willard, et al. Circulation 89:2190-2197 (1994); French, et
al. Circulation 90:2402-2413 (1994)). In further embodiments,
the fusion polypeptides of the invention can be used in the
same manner.

The following examples are included for illustrative
purposes and should not be considered to limit the present
invention.

EXAMPLES 

E2F-RB Fusions

A. Introduction

In this example, expression plasmids which encode
different segments of E2F fused to RB56 polypeptide were
constructed. RB56 is a subfragment of full length RB which
contains the "pocket" domains necessary for growth suppression
(Hiebert, et al. MCB 13:3384-3391 (1993); Qin, et al. Genes
and Dev. 6:953-964 (1992)). E2F194 contains E2F amino acids
95-194. This fragment contains only the DNA binding domain of
E2F. E2F286 contains the DNA binding domain and the DP-1
heterodimerization domain. Both E2F fragments lack the
N-terminal cyclin A-kinase binding domain, which appears to
down-regulate the DNA binding activity of E2F (Krek et al.
Cell 83:1149-1158 (1995); Krek et al. Cell 78:161-172 (1994)).

B. Construction of vectors

Plasmid pCTM contains a CMV promoter, a tripartite
adenovirus leader flanked by T7 and SP6 promoters, and a
multiple cloning site with a bovine growth hormone (BGH)
polyadenylation site and a SV40 polyadenylation site
downstream. A diagrammatic representation of pCTM is provided
in Figure 3. The DNA sequence for pCTM is provided in Figure
4.

pCTMI was constructed from pCTM by digesting pCTM
with Xho I and Not I and subcloning a 180 bp intron Xho I-Not I
fragment from a pCMV-β-gal vector (Clontech). A
diagrammatic representation of pCTMI is provided in Figure 5.
The DNA sequence is provided in Figure 6.

pCTMIE was constructed by amplifying the SV40
enhancer from SV40 viral DNA in a polymerase chain reaction.
The amplified product was digested with BglII and inserted
into BamHI-digested pCTMI and ligated in the presence of
BamHI. The plasmid is depicted diagrammatically in Figure 7.
The DNA sequence is provided in Figure 8.

pCTM-RB was prepared as follows. A 3.2 KB Xba I-
Cla I fragment of pETRBc (Huang et al. Nature 350:160-162
(1991)) containing the full length human RB cDNA was ligated
to Xba I-Cla I digested pCTM. pCTM-RB56 was prepared by
ligating the digested pCTM to a 1.7 KB Xba I-Cla I fragment
containing the coding sequence for RB56. pCTMI-RB, pCTMIE-RB,
pCTMI-RB56 (amino acids 381-928) and pCTMIE-RB56 (amino acids
381-928) were all constructed by the same methods.
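As a rough consistency check on the fragment sizes quoted above, the
coding length implied by an amino acid range can be computed directly.
The following is a minimal sketch only: the function name is
hypothetical, and it assumes three bases per codon and ignores any
untranslated or linker sequence carried on the restriction fragment.

```python
def coding_length_bp(first_aa: int, last_aa: int, include_stop: bool = False) -> int:
    """Base pairs needed to encode residues first_aa..last_aa (inclusive)."""
    if last_aa < first_aa:
        raise ValueError("last_aa must be >= first_aa")
    codons = last_aa - first_aa + 1
    if include_stop:
        codons += 1  # one extra codon for the stop signal
    return 3 * codons


# RB56 as described in the text: amino acids 381-928 of RB.
rb56_bp = coding_length_bp(381, 928)
print(rb56_bp)  # 1644
```

Under these assumptions the RB56 coding region is about 1.6 kb, which
fits within the 1.7 KB Xba I-Cla I fragment described in the text.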

C. RB-E2F fusion constructs

Figure 9 depicts the fusion constructs used in these
studies. These E2F constructs commenced at amino acid 95 and
lacked part of the cyclin A binding domain. E2F437 contained
the DNA binding domain (black), heterodimerization domain
(white) and transactivation domain (stippled). E2F194
contained solely the DNA binding domain. E2F286 contained the
DNA binding domain and DP-1 heterodimerization domain.
RB56-5s refers to an RB variant having alanine substitutions at
amino acid residues 606, 612, 788, 807 and 811. In
E2F194-RB56-5s and E2F286-RB56-5s, the E2F fragments were fused
in frame to codon 379 of RB-5s. RB56-C706F contained an
inactivating point mutation (Kaye et al. Proc. Natl. Acad.
Sci. U.S.A. 87:6922-6926 (1990)).

pCMV-E2F194 and pCMV-E2F437 were constructed as
follows. DNA encoding amino acids 95-194 of E2F (containing
the DNA binding domain) or amino acids 95-437 was amplified in
a polymerase chain reaction, digested with HindIII, and ligated
into SmaI/HindIII digested pCMV-RB56 vectors. pCMV-E2F286 was
constructed by digesting pCMV-E2F437 with AflII, treating the
ends with DNA pol I (Klenow fragment) and religating in the
presence of AflII. The blunt end ligation created a stop
codon at position 287. pCMV-E2F286-5s was constructed by
ligating AflII (blunt)/HindIII digested pE2F437 to a SalI
(blunt)-HindIII fragment containing the RB56-5s coding
sequence. pCTMIE-E2F194-5s and pCTMIE-E2F286-RB5s were
constructed by ligating EcoRI-EcoRV digested pCTMIE (4.2 KB)
to HindIII (blunt)-EcoRI fragments from either
pCMV-E2F194-RB5s or pCMV-E2F286-RB5s.

D. Promoter Repression

To measure the effect of the E2F-RB fusion proteins,
cervical carcinoma cell line C33A (ATCC # HTB-31) was
transfected with equivalent amounts of E2F194-RB56 or E2F RB56
with an E2-CAT reporter plasmid (See, e.g., Weintraub et al.
Nature 358:259-261 (1992)).

In the C33A assay, 250,000 C33A cells were seeded
into each well of 6-well tissue culture plates and allowed
to adhere overnight. 5 μg each of pCMV-RB56, pCMV-E2F RB56,
or pCMV-E2F plasmid were cotransfected (calcium phosphate
method, MBS transfection kit, Stratagene) with 5 μg of
indicated reporter construct (E2-CAT or SVCAT) and 2.5 μg β-gal
plasmid (pCMV-β, Clontech) per well into duplicate wells.
Cells were harvested 72 hours after transfection and extracts
were prepared.

In the 5637 assay, 250,000 5637 cells were seeded as
described above. 1 μg each of RB or E2F-RB fusion plasmid,
E2-CAT or SV-CAT reporter plasmid and pCMV-β-galactosidase
were cotransfected using the lipofectin reagent (BRL,
Bethesda, Maryland) according to the manufacturer's
instructions.

CAT assays were performed using either 20 μL (C33A)
or 50 μL (5637) of cell extract (Gorman et al. Mol. Cell.
Biol. 2:1044 (1982)). TLCs were analyzed on a Phosphoimager
SF (Molecular Dynamics). CAT activities were normalized for
transfection efficiency according to β-galactosidase
activities of each extract. β-galactosidase activities of
extracts were assayed as described by Rosenthal et al. Meth.
Enzym. 152:704 (1987).
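The normalization described above can be sketched as a simple ratio,
with fold repression taken as the normalized reporter-alone activity
divided by the normalized activity in the presence of an effector.
This is a minimal illustration under stated assumptions: the function
names and all numeric readings are hypothetical and are not data from
these experiments.

```python
def normalized_cat(cat_activity: float, bgal_activity: float) -> float:
    """CAT activity corrected for transfection efficiency via beta-gal activity."""
    if bgal_activity <= 0:
        raise ValueError("beta-galactosidase activity must be positive")
    return cat_activity / bgal_activity


def fold_repression(reporter_alone: float, with_effector: float) -> float:
    """Ratio of normalized reporter-alone activity to activity with an effector."""
    if with_effector <= 0:
        raise ValueError("normalized activity with effector must be positive")
    return reporter_alone / with_effector


# Hypothetical readings (arbitrary units), chosen only to show the arithmetic.
reporter_alone = normalized_cat(cat_activity=1000.0, bgal_activity=2.0)
with_effector = normalized_cat(cat_activity=200.0, bgal_activity=4.0)
print(fold_repression(reporter_alone, with_effector))  # 10.0
```

Dividing by the β-galactosidase reading keeps a well with higher
transfection efficiency from appearing to have higher reporter
activity than it really does.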

The results of these studies were as follows.
Transfection of the E2-CAT reporter alone or in the presence
of the nonfunctional control RB56-H209 mutant yielded
relatively high CAT activity. Cotransfection of wild-type
RB56 or the variant RB56-5s resulted in a 10 to 12 fold
repression of CAT activity, indicating that RB56 and RB56-5s
are both capable of efficiently repressing E2F-dependent
transcription. E2F194-RB5s and E2F286-RB5s repressed
transcription approximately 50 fold. Transcriptional
repression required both the RB56 and the E2F components of
the fusion proteins, as expression of E2F194 and E2F286 did
not mediate transcriptional repression. No repression of
SV40-CAT transcription occurred with E2F-RB constructs, thus
demonstrating the specificity of the transcriptional
repression by E2F-RB for the E2 promoter. These results are
depicted diagrammatically in Figure 10.
The ability of E2F-RB fusion polypeptides to cause
G1 arrest in Saos-2 (RB-/- cells) (ATCC # HTB-85) and C33A
cells was investigated. Previous studies have shown that
RB-mediated E2 promoter repression and G1 arrest are linked in
Saos-2 cells but dissociated in C33A (RBmut) cells (Xu, et al.
PNAS 92:1357-1361 (1992)). Cells were washed in PBS and were
fixed in 1 mL -20°C 70% ethanol for 30 minutes. Cells were
collected by centrifugation and resuspended in 0.5 mL 2% serum
containing 10 μg/ml RNase A and incubated for 30 minutes at
37°C. 0.5 mL of PBS containing propidium iodide (100 μg/ml) was
added to each sample, mixed and cells were filtered through a
FACS tube cap strainer. FACS analysis was performed on a
FACS-Scan (Becton-Dickinson) using doublet discrimination.
5,000-10,000 CD20+ events were analyzed. Percent of cells in
G0/G1, S, and G2/M was determined using Modfit modeling
software.

The results of this experiment were as follows.
Both full length RB110 and the truncated version RB56, but not
the control mutant RB-H209, caused G1 arrest in Saos-2 cells
(Table 1). Similarly, RB56-5s, E2F194-RB56-5s and
E2F286-RB56-5s all were capable of arresting cells in G0/G1.
Transfection of the DNA binding domain, E2F194, did not block
S-phase entry in Saos-2 as previously described for rodent
cells (Dobrowolski, et al. Oncogene 9:2605-2612 (1994)). In
contrast, RB110, RB56, and E2F-RB fusion proteins were not
capable of arresting C33A cell lines, indicating that the
transcriptional repression observed in these cells does not
translate into G1 arrest.

The ability of the E2F-RB fusion proteins to arrest
5637 cells was also investigated (Table 2). RB56 and RB56-5s
both efficiently arrested cells in G0/G1 (approximately 90% of
cells in G0/G1), whereas E2F194-RB56-5s and E2F286-RB56-5s were
slightly less efficient (about 80% of cells in G0/G1) at
promoting G0/G1 arrest. Without being limited to any one
theory, the less efficient arrest of both Saos-2 and 5637
cells by the E2F-RB fusion proteins appears due to the lower
levels of steady-state protein produced in these cells (Figure
11, panels b and c).
Table 1: Cell Cycle Regulation by RB and E2F-RB fusion proteins in RBneg cells

                  % Cells (CD20+)
Construct       G0/G1     G2/M      S-phase
H209            52.1      27.1      20.8
p56RB           78.8      14.2       7.0
p110RB          70.9      14.3      14.8
p56RB-5s        84.8      13.2       2.0
p56RB-p5        81.3      11.5       7.3
E2F-194-5s      77.8      14.9       7.3
E2F-286-5s      72.2      15.0      12.8
E2F-194         49.9      28.0      22.1
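The G0/G1 enrichment reported in Table 1 can be summarized as the
percentage-point change relative to the H209 control. The following
is a minimal sketch: the values are transcribed from Table 1 (with
the column order G0/G1, G2/M, S-phase assumed from the table header),
and the dictionary and function names are hypothetical.

```python
# Values transcribed from Table 1 (% of CD20+ cells): (G0/G1, G2/M, S-phase).
TABLE_1 = {
    "H209": (52.1, 27.1, 20.8),
    "p56RB": (78.8, 14.2, 7.0),
    "p110RB": (70.9, 14.3, 14.8),
    "p56RB-5s": (84.8, 13.2, 2.0),
    "p56RB-p5": (81.3, 11.5, 7.3),
    "E2F-194-5s": (77.8, 14.9, 7.3),
    "E2F-286-5s": (72.2, 15.0, 12.8),
    "E2F-194": (49.9, 28.0, 22.1),
}


def g0g1_shift(construct: str, control: str = "H209") -> float:
    """Percentage-point change in G0/G1 relative to the control, to one decimal."""
    return round(TABLE_1[construct][0] - TABLE_1[control][0], 1)


for name in TABLE_1:
    print(f"{name:12s} {g0g1_shift(name):+.1f}")
```

Read this way, the fusion constructs shift roughly 20 to 26
percentage points of cells into G0/G1, while E2F-194 alone stays
close to the control.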
Table 2: Growth Suppression of 5637 Bladder Cells by RB and E2F-RB fusion proteins

                  % Cells (5637/CD20+)
Construct       G0/G1     S         G2/M
CD20            59.7      16.9      20.6
RB56-C706F      57.4      16.3      24.3
RB56WT          90.7       4.12      4.88
RB56-5s         89.91      3.51      6.1
E2F194-5s       80.1       1.31      0
E2F-286-5s      79.21      8.1       0



F. Activity of Fusion Proteins in Functional RB Background

The activity of the E2F-RB fusion proteins in a
cellular background containing functional RB was then
determined. NIH-3T3 cells were transfected with RB56 or
E2F-RB56 fusions and stained with anti-RB monoclonal antibody
3C8 (Wen et al. J. Immuno. Meth. 169:231-240 (1994)). FACS
analysis was performed on the RB expressing cells. The
results are shown in Figure 12. The non-gated population (g)
shows the characteristic cell cycle distribution for NIH-3T3
cells (60% G0, 28% S, 10% G2/M). In contrast, in cells
transfected with RB56 (a,b) or E2F-RB fusion proteins (c-f),
greater than 90% of the RB-expressing cells were arrested in
G0/G1. These data demonstrate that the ability of RB and
E2F-RB56 fusions to arrest cells in G0/G1 is not limited to RB
negative tumor cells. The relative levels of protein
expressed in transfected NIH-3T3 cells were also investigated.
RB110 was not expressed efficiently in these cells.

Thus, these data demonstrate that E2F-RB fusion
proteins are more efficient transcriptional repressors than
either pRB or RB56 alone, and that RB can repress
transcription by remaining bound to E2F rather than directly
blocking the transactivation domain of E2F. These data
support the use of E2F-RB fusions as RB agonists in both RB+
cells and in RB negative or RB mutant cells.

Example IX.

Tissue-Specific Expression of E2F-RB Fusions

A. Construction of Recombinant Adenovirus

In this experiment, recombinant adenoviruses
comprising an RB polypeptide under the control of a CMV or
smooth muscle alpha actin promoter were generated.

The smooth muscle α-actin promoter (bases -670
through +5, Reddy et al. "Structure of the Human Smooth Muscle
α-Actin Gene." J. Biol. Chem. 265:1683-1687 (1990), Nakano,
et al. "Transcriptional Regulatory Elements In The 5' Upstream
and First Intron Regions of The Human Smooth Muscle (aortic
type) α-Actin-Encoding Gene." Gene 99:285-289 (1991)) was
isolated by PCR from a genomic library with 5' Xho I and Avr
II and 3' Xba I, Cla I and Hind III restriction sites added
for cloning purposes. The fragment was subcloned as an Xho I,
Hind III fragment into a plasmid for sequencing to verify base
composition. A fusion construct 286-56 containing the DNA
binding and heterodimerization domains of E2F-1 (bases 95-286)
linked to p56 (amino acids 379-928 of full length RB) was
subcloned as an Xba I, Cla I fragment directly downstream of
the smooth muscle α-actin promoter, and this expression
cassette was digested out and cloned into the plasmid
pAd/ITR/IX- as an Xba I to Avr II, and Cla I fragment to create
the plasmid pASN286-56. This plasmid consisted of the
adenovirus type 5 inverted terminal repeat (ITR), packaging
signals and E1a enhancer, followed by the human smooth muscle
α-actin promoter and 286-56 cassette, and then Ad 2 sequence
4021-10462 (which contains the E1b/protein IX poly A signal) in
a pBR322 background. Recombinant adenovirus was produced by
standard procedures. The plasmid pASN286-56 was linearized
with Ngo MI and co-transfected into 293 cells with the large
fragment of Cla I digested rAd34, which has deletions in both
the E3 and E4 regions of adenovirus type 5. Ad34 was a
serotype 5 derivative with a 1.9 KB deletion in early region 3
resulting from deletion of the Xba I restriction fragment
extending from Ad5 coordinates 28593 to 30470 and a 1.4 KB
deletion of early region 4 resulting from a Taq I fragment of
E4 (coordinates 33055-35573) being replaced with a cDNA
containing E4 ORF 6 and 6/7.

Recombinant adenovirus produced by homologous
recombination was isolated and identified by restriction
digest analysis and further purified by limiting dilution.
Additional control recombinant adenoviruses are described
elsewhere and include the control virus ACN (CMV promoter,
Wills, et al. "Gene Therapy For Hepatocellular Carcinoma:
Chemosensitivity Conferred By Adenovirus-Mediated Transfer of
The HSV-1 Thymidine Kinase Gene." Cancer Gene Therapy 2:191-
197 (1995)), and ACN56 (RB expressed under control of a CMV
promoter).

ACN56 was prepared as follows. A plasmid containing p56
cDNA was constructed by replacing the p53 cDNA from the
plasmid ACNP53 (Wills et al. Human Gene Therapy 5:1079-1088
(1994)) with a 1.7 KB Xba I-BamHI fragment isolated from
plasmid pET 9a-Rb56 (Antelman et al. Oncogene 10:697-704
(1995)) which contains p56 cDNA. The resulting plasmid
contained amino acids 381-928 of p56, the Ad5 inverted
terminal repeat, viral packaging signals and E1a enhancer,
followed by the human cytomegalovirus immediate early promoter
(CMV) and Ad 2 tripartite leader cDNA to drive p56 expression.
The p56 cDNA was followed by Ad 2 sequence 4021-10462 in a
pBR322 background. This plasmid was linearized with EcoRI
and cotransfected with the large fragment of bsp 106 digested
DL327 (E3 deleted; Thimmappaya et al. Cell 31:543-551 (1982))
or h5ile4 (E4 deleted; Hemstrom et al. J. Virol. 62:3258-3264
(1988)). Recombinant viruses were further purified by
limiting dilution.

B. Cellular Proliferation

In this experiment, cell lines were infected in
culture with recombinant adenovirus RB constructs to ascertain
the relative expression of the RB polypeptide and the effect
on cell proliferation.

For H35 8 (ATCC # Crl 5807) and MDA-MB4 68 (ATCC # HTB 
132, breast adenocarcinoma) cells, 5,000 cell/well were plated 
in normal growth media in a 96 well microtiter plate (Costar) 

20 and allowed to incubate overnight at 37°C, 7% CQ 2 . Viruses 

were serially diluted in growth media and used to infect cells 
at the indicated doses for 4 8 hours. At this point, 3 H- 
thymidine was added (Amersham, 0.5 £iCi/well) and the cells 
were incubated at 37°C for another 3 hours prior to harvest. 

Both A7r5 (ATCC CRL 1444, rat smooth muscle) and A10 (ATCC CRL
1476, rat smooth muscle) cells were seeded at 3,000 cells/well
in either DME + 0.5% FCS or DME + 20% FCS, respectively. Virus
was serially diluted in the seeding media and used to infect
the cells at the doses indicated in the Figures. The
infection and labelling procedures were the same for A10 cells
as for the H358 and MDA-MB468 cells, except that 2 µCi/well of
label was used. The A7r5 cells were not infected with virus
until 48 hours after seeding. Forty-eight hours after
infection, the serum concentration was raised to 10% FCS, 2
µCi/well of 3H-thymidine was added, and incubation continued
for an additional 3 hours prior to harvest. All cells were
harvested by aspirating media from the wells, trypsinizing
the cells, and collecting them on a 96 well GF/C filter with
a Packard TopCount cell harvester. Results are plotted as
the mean percentage (+/- SD) of media treated control
proliferation versus dose of virus in Figures 13 and 14.
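
The normalization described above, in which the 3H-thymidine counts of each infected well are expressed as a percentage of the media treated control and summarized as mean +/- SD, can be sketched as follows. This is a minimal illustration only: the function name and the example counts are hypothetical, and the use of the sample standard deviation (n - 1) is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev


def percent_of_control(treated_counts, control_counts):
    """Express replicate well counts as % of the mean media-treated control.

    Returns (mean percentage, sample SD of the percentages).
    Assumes at least two treated wells and a non-zero control mean.
    """
    control_mean = mean(control_counts)
    percents = [100.0 * count / control_mean for count in treated_counts]
    return mean(percents), stdev(percents)


# Hypothetical counts (cpm) for one virus dose; not data from the specification.
control_wells = [1000, 1000, 1000]
treated_wells = [500, 600, 400]
avg, sd = percent_of_control(treated_wells, control_wells)
print(f"{avg:.1f}% of control (+/- {sd:.1f})")
```

Repeating the calculation for each virus dose gives the points of a dose-response curve of the kind plotted in Figures 13 and 14.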

Thus, Figure 13 depicts a comparison of the effects
of adenovirus p56 constructs on the smooth muscle cell lines
A10 and A7R5. The CMV-driven p56 (ACN56) virus inhibited A10
growth to approximately the same extent as the actin
promoter-driven E2F-fusion constructs (ASN586-56 #25,26). In Figure
14, the effects of adenovirus constructs on inhibition of a
breast cancer cell line, MDA-MB468, and a non-small cell lung
carcinoma cell line, H358, are depicted. In these
experiments, actin promoter-driven E2F-p56 was ineffective,
while the CMV promoter-driven p56 was effective in inhibiting
growth of non-smooth muscle cells.

To determine whether the non-smooth muscle cells
were more infectable with adenovirus than the smooth muscle
cell lines used, the four cell lines, H358, MB468, A7R5, and
A10, were infected at an MOI of 5 with an adenovirus expressing
β-galactosidase (ACβGL; Wills, et al. Human Gene Therapy
5:1079-1088 (1994)) and the degree of β-gal staining was examined.
As shown in Figure 15 (top), the non-smooth muscle cell lines
were significantly more infectable than the smooth muscle cell
lines. In a further test, cells were infected at higher
multiplicities of infection (50, 100, 250, 500) with ACN56 and
the amount of p56 present in the infected cells was detected by
autoradiography. As can be seen in Figure 15 (bottom), the
non-muscle cell lines had significantly more p56 present:
as a result of their greater infectivity, infected cells
have a greater viral load and thus more copies of the p56
template driven by the non-tissue specific CMV promoter.
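
For orientation, the relationship between multiplicity of infection, cell number and virus input used above can be sketched as a short calculation. This is an illustrative sketch only: the stock titer is a hypothetical value (the specification does not give one), and the helper name is introduced here for the example.

```python
def virus_volume_ul(cells_per_well, moi, stock_titer_pfu_per_ml):
    """Volume of virus stock (in microliters) needed to infect one well.

    MOI is taken as plaque-forming units added per cell, so the required
    input is cells * MOI pfu, converted to a volume through the stock titer.
    """
    pfu_needed = cells_per_well * moi
    return pfu_needed / stock_titer_pfu_per_ml * 1000.0


# Hypothetical stock titer of 1e9 pfu/ml; 5,000 cells/well as in the plating step.
for moi in (5, 50, 100, 250, 500):
    print(moi, round(virus_volume_ul(5000, moi, 1e9), 3), "ul")
```

In practice such small volumes would be reached by serial dilution of the stock in growth media, as described for the proliferation assays.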

In a further experiment, the specificity of the
actin smooth muscle promoter for smooth muscle tissue was
ascertained. In this experiment, β-gal expression levels in
cells infected with β-gal constructs driven by different
promoters were measured. As can be seen in Figure 19, despite
the lower infectivity of the smooth muscle cells, expression
was only evident in these cells using the smooth muscle alpha
actin promoter.

Figure 21 depicts a comparison of the effects of a
CMV driven p56 recombinant adenovirus (ACN56S4) vs. a human
smooth muscle alpha-actin promoter driven E2F-p56 fusion
construct (ASN286-56) vs. control adenoviral constructs
containing either the CMV or smooth muscle alpha-actin
promoter without a downstream transgene (ACNE3 or ASBE3-2
isolates shown, respectively). Assays measured 3H-thymidine
uptake in either a smooth muscle cell line (A7R5) or a
non-muscle cell line (MDA-MB468, breast carcinoma). Results
demonstrated muscle tissue specificity using the smooth muscle
alpha-actin promoter and specific inhibition by both the p56
and E2F-p56 transgenes relative to their respective controls.

C. Inhibition of Restenosis 

The model of balloon injury was based on that
described by Clowes, et al. (Clowes, Lab. Invest. 49:327-333
(1983)). Male Sprague-Dawley rats weighing 400-500 g were
anesthetized with an intraperitoneal injection of sodium
pentobarbital (45 mg/kg; Abbott Laboratories, North Chicago,
Illinois). The bifurcation of the left common carotid artery
was exposed through a midline incision and the left common,
internal, and external carotid arteries were temporarily
ligated. A 2F embolectomy catheter (Baxter Edwards Healthcare
Corp., Irvine, CA) was introduced into the external carotid
and advanced to the distal ligation of the common carotid.
The balloon was inflated with saline and drawn towards the
arteriotomy site 3 times to produce a distending,
deendothelializing injury. The catheter was then withdrawn.
Adenovirus (1 x 10^9 pfu of Ad-RB (ACNRb) or Ad-p56 (ACN56) in
a volume of 10 µl diluted to 100 µl with 15% (wt/vol) Poloxamer
407 (BASF, Parsippany, N.J.), or Ad-β-Gal (1 x 10^9 pfu, diluted
as above)) was injected via a cannula, inserted just proximal to
the carotid bifurcation, into a temporarily isolated segment of
the artery. The adenovirus solution was incubated for 20
minutes, after which the viral infusion was withdrawn and the
cannula removed. The proximal external carotid artery was
then ligated and blood flow was restored to the common carotid
artery by release of the ligatures. The experimental protocol
was approved by the Institutional Animal Care and Use
Committee and complied with the "Guide for the Care and Use of
Laboratory Animals" (NIH Publication No. 86-23, revised
1985).

Rats were sacrificed at 14 days following treatment
with an intraperitoneal injection of pentobarbital (100
mg/kg). The initially balloon injured segment of the left
common carotid artery, from the proximal edge of the omohyoid
muscle to the carotid bifurcation, was perfused with saline
and dissected free of the surrounding tissue. The tissue was
fixed in 100% methanol until embedded in paraffin. Several
4-µm sections were cut from each tissue specimen. One section
from each specimen was stained with hematoxylin and eosin and
another with Richardson's combination elastic-trichrome stain
for conventional light microscopic analysis.

Histological images of cross sections of hematoxylin
and eosin or elastic-trichrome stained arterial sections were
projected onto a digitizing board (Summagraphics) and the
intimal, medial and luminal areas were measured by
quantitative morphometric analysis using a computerized
sketching program (MACMEASURE, version 1.9, National Institute
of Mental Health).

Results were expressed as the mean ± S.E.M.
Differences between groups were analyzed using an unpaired
two-tailed Student's t test. Statistical significance was
assumed when the probability of a null effect was <0.05.
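
The statistical treatment described above (mean ± S.E.M. and an unpaired two-tailed Student's t test with a 0.05 threshold) can be sketched in a few lines. This is a minimal illustration under stated assumptions: the pooled-variance form of the test is assumed, the p-value is obtained by numerically integrating the t density rather than from a statistics package, and the function names and example values are hypothetical, not data from the specification.

```python
import math
from statistics import mean, stdev


def mean_sem(values):
    """Return (mean, standard error of the mean) for one group."""
    return mean(values), stdev(values) / math.sqrt(len(values))


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def unpaired_t_test(a, b, steps=2000):
    """Pooled-variance unpaired t test; returns (t, df, two-tailed p).

    Assumes each group has at least two values and non-zero pooled variance.
    """
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / na + 1 / nb))
    # Simpson's rule (even number of steps) for the area between -|t| and |t|.
    upper = abs(t)
    h = 2 * upper / steps
    area = t_pdf(-upper, df) + t_pdf(upper, df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_pdf(-upper + i * h, df)
    area *= h / 3
    return t, df, max(0.0, 1.0 - area)


# Hypothetical intimal area measurements for two treatment groups.
control = [0.21, 0.25, 0.19, 0.23, 0.22]
treated = [0.10, 0.12, 0.08, 0.11, 0.09]
t, df, p = unpaired_t_test(control, treated)
print(mean_sem(control), mean_sem(treated), "significant" if p < 0.05 else "n.s.")
```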

Results are shown in Figures 17 and 18. In Figure
17, the relative inhibition of neointima formation is depicted
graphically, demonstrating the ability of p56 and RB to
inhibit neointima formation. Figure 18 provides photographic
evidence of the dramatic reduction of neointima in the
presence of p56.

Adenovirus-treated carotid arteries were harvested
from rats at 2 days following balloon injury and infection.
Tissue was fixed in phosphate-buffered formalin until embedded
in paraffin. Tissue was cut into 4 µm cross-sections and
dewaxed through xylene and graded alcohols. Endogenous
peroxidase was quenched with 1% hydrogen peroxide for 30
minutes. Antigen retrieval was performed in 10 mM sodium
citrate buffer, pH 6.0, at 95°C for 10 minutes. A monoclonal
anti-RB antibody (AB-5, Oncogene Sciences, Uniondale, New
York) was applied at 10 µg/ml in PBS in a humid chamber at 4°C for
24 hours. Secondary antibody was applied from the Unitect
Mouse Immunohistochemistry Kit (Oncogene Sciences, Uniondale,
New York) according to the manufacturer's instructions. The
antibody complexes were visualized using 3,3'-diaminobenzidine
(DAB, Vector Laboratories, Burlingame, CA). Slides were then
counterstained with hematoxylin and mounted. The results are
depicted in Figure 20.

All references cited herein are hereby incorporated
by reference in their entirety for all purposes.

WHAT IS CLAIMED IS:

1. A polypeptide comprising a fusion of a
transcription factor, the transcription factor comprising a
DNA binding domain, and a retinoblastoma (RB) polypeptide, the
RB polypeptide comprising a growth suppression domain.

2. A nucleic acid encoding the fusion polypeptide
of claim 1.

3. The nucleic acid of claim 2, wherein the
nucleic acid is inserted in an adenovirus vector.

4. The polypeptide of claim 1, wherein the
transcription factor is E2F.

5. The polypeptide of claim 4, wherein the cyclin
A binding domain of the E2F is deleted or nonfunctional.

6. The polypeptide of claim 1, wherein the
retinoblastoma polypeptide is RB56.

7. The polypeptide of claim 1, wherein the
retinoblastoma polypeptide is wild type RB.

8. The polypeptide of claim 1, wherein the
retinoblastoma polypeptide comprises from about amino acid
residue 379 to about amino acid residue 928 of pRB.

9. The polypeptide of claim 1, wherein the
retinoblastoma polypeptide comprises at least one substitution
of amino acid residues selected from the group consisting of
2, 608, 612, 788, 807, and 811 of pRB.

10. The polypeptide of claim 5, wherein the E2F
comprises about amino acid residues 95 to about 286.

11. The polypeptide of claim 4, wherein the E2F
comprises about amino acid residues 95 to about 194.

12. The polypeptide of claim 1, wherein the fusion
comprises E2F amino acid residues from about 95 to about 194
operatively linked to RB amino acid residues from about 379 to
about 928.

13. An expression vector comprising DNA encoding a
polypeptide, the polypeptide comprising a fusion of a
transcription factor, the transcription factor comprising a
DNA binding domain, and a retinoblastoma (RB) polypeptide, the
RB polypeptide comprising a growth suppression domain.

14. The vector of claim 13, comprising a
tissue-specific promoter operatively linked to DNA encoding the
fusion.

15. The vector of claim 14, wherein the tissue
specific promoter is a smooth muscle actin promoter.

16. A method for treatment of a hyperproliferative
disorder in a patient comprising administering to a patient a
therapeutically effective dose of a fusion polypeptide
comprising a fusion of a transcription factor, the
transcription factor comprising a DNA binding domain, and a
retinoblastoma (RB) polypeptide, the RB polypeptide comprising
a growth suppression domain.

17. The method of claim 16, wherein the fusion
protein is encoded by a nucleic acid delivered to the patient.

18. The method of claim 16, wherein the
transcription factor is E2F.

19. The method of claim 18, wherein the cyclin A
binding domain of the E2F is deleted or nonfunctional.

20. The method of claim 16, wherein the RB is RB56.

21. The method of claim 16, wherein the RB is wild
type RB56.

22. The method of claim 16, wherein the RB
comprises from about amino acid residue 379 to about amino
acid residue 928.

23. The method of claim 16, wherein the RB
comprises at least one substitution of amino acid residues
selected from the group consisting of 2, 608, 612, 788, 807,
and 811.

24. The method of claim 18, wherein the E2F
comprises about amino acid residues 95 to about 286.

25. The method of claim 18, wherein the E2F
comprises about amino acid residues 95 to about 194.

26. The method of claim 16, wherein the fusion
comprises E2F amino acid residues from about 95 to about 194
operatively linked to RB amino acid residues from about 379 to
about 928.

27. The method of claim 18, wherein the E2F-RB
fusion polypeptide is expressed under the control of a
tissue-specific promoter.

28. The method of claim 27, wherein the tissue
specific promoter is a smooth muscle actin promoter.

29. The method of claim 16, wherein the
hyperproliferative disorder is cancer.

30. The method of claim 29, wherein the cancer is
bladder cancer.

31. The method of claim 29, wherein the
hyperproliferative disorder is restenosis.

32. The method of claim 31, wherein the E2F-RB
fusion polypeptide is administered after angioplasty.

33. The method of claim 32, wherein the E2F-RB
fusion polypeptide is administered as a coating on an
angioplasty device.

34. The method of claim 17, wherein the nucleic
acid is administered after angioplasty.

35. The method of claim 17, wherein the nucleic
acid is administered as a coating on an angioplasty device.

36. The method of claim 17, wherein the nucleic
acid is inserted in an adenovirus vector.

TISSUE SPECIFIC EXPRESSION OF RETINOBLASTOMA
PROTEIN

ABSTRACT OF THE DISCLOSURE

Fusions of the transcription factor E2F and the
retinoblastoma protein RB are provided, along with methods
for treatment of hyperproliferative diseases.
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1270 1280 1290 1300 1310 1320 

CCGCTGGTGG CGGCCGACTC GCTCCTGGAG CATGTGCGGG AGGACTTCTC CGGCCTCCTC 

1330 1340 1350 1360 1370 1380 

CCTGAGGAGT TCATCAGCCT TTCCCCACCC CACGAGGCCC TCGACTACCA CTTCGGCCTC 

1390 1400 1410 1420 1430 1440 

GAGGAGGGCG AGGGCATCAG AGACCTCTTC GACTGTGACT TTGGGGACCT CACCCCCCTG

1450 1460 1470 1480 1490 1500 

GATTTCTGAC AGGGCTTGGA GGGACCAGGG TTTCCAGAGT AGCTCACCTT GTCTCTGCAG

1510 1520 1530 1540 1550 1560 

CCCTGGAGCC CCCTGTCCCT GGCCGTCCTC CCAGCCTGTT TGGAAACATT TAATTTATAC 

1570 1580 1590 1600 1610 1620 

CCCTCTCCTC TGTCTCCAGA AGCTTCTAGC TCTGGGGTCT GGCTACCGCT AGGAGGCTGA 

1630 1640 1650 1660 1670 1680 

GCAAGCCAGG AAGGGAAGGA GTCTGTGTGG TGTGTATGTG CATGCAGCCT ACACCCACAC 

1690 1700 1710 1720 1730 1740 

GTGTGTACCG GGGGTGAATG TGTGTGAGCA TGTGTGTGTG CATGTACCGG GGAATGAAGG 

1750 1760 1770 1780 1790 1800 

TGAACATACA CCTCTGTGTG TGCACTGCAG ACACGCCCCA GTGTGTCCAC ATGTGTGTGC

1810 1820 1830 1840 1850 1860

ATGAGTCCAT CTCTGCGCGT GGGGGGGCTC TAACTGCACT TTCGGCCCTT TTGCTCGTGG 

1870 1880 1890 1900 1910 1920 

GGTCCCACAA GGCCCAGGGC AGTGCCTGCT CCCAGAATCT GGTGCTCTGA CCAGGCCAGG 

1930 1940 1950 1960 1970 1980 

TGGGGAGGCT TTGGCTGGCT GGGCGTGTAG GACGGTGAGA GCACTTCTGT CTTAAAGGTT 

1990 2000 2010 2020 2030 2040 

TTTTCTGATT GAAGCTTTAA TGGAGCGTTA TTTATTTATC GAGGCCTCTT TGGTGAGCCT 

2050 2060 2070 2080 2090 2100 

GGGGAATCAG CAAAAGGGGA GGAGGGGTGT GGGGTTGATA CCCCAACTCC CTCTACCCTT 

2110 2120 2130 2140 2150 2160 

GAGCAAGGGC AGGGGTCCCT GAGCTGTTCT TCTGCCCCAT ACTGAAGGAA CTGAGGCCTG 

2170 2180 2190 2200 2210 2220 

GGTGATTTAT TTATTGGGAA AGTGAGGGAG GGAGACAGAC TGACTGACAG CCATGGGTGG

2230 2240 2250 2260 2270 2280 

TCAGATGGTG GGGTGGGCCC TCTCCAGGGG GCCAGTTCAG GGCCCAGCTG CCCCCCAGGA 

2290 2300 2310 2320 2330 2340 

TGGATATGAG ATGGGAGAGG TGAGTGGGGG ACCTTCACTG ATGTGGGCAG GAGGGGTGGT 

2350 2360 2370 2380 2390 2400 

GAAGGCCTCC CCCAGCCCAG ACCCTGTGGT CCCTCCTGCA GTSTCTGAAG CGCCTGCCTC 

2410 2420 2430 2440 2450 2460 

CCCACTGCTC TGCCCCACCC TCCAATCTGC ACTTTGATTT GCTTCCTAAC AGCTCTGTTC 

2470 2480 2490 2500 2510 2520

CCTCCTGCTT TGGTTTTAAT AAATATTTTG ATGACGTTAA AAAAAGGAAT TCGATAT 
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1 ttccggtttt tctcagggga cgttgaaatt atttttgtaa cgggagtcgg gagaggacgg 
61 ggcgtgcccc gcgtgcgcgc gcgtcgtcct ccccggcgct cctccacagc tcgctggctc 
121 ccgccgcgga aaggcgtcat gccgcccaaa accccccgaa aaacggccgc caccgccgcc 
181 gctgccgccg cggaaccccc ggcaccgccg ccgccgcccc ctcctgagga ggacccagag 
241 caggacagcg gcccggagga cctgcctctc gtcaggctcg agtttgaaga aacagaagaa
301 cctgatttta ctgcattatg tcagaaatta aagataccag atcatgtcag agagagagct 
361 tggttaactt gggagaaagt ttcatctgtg gatggagtat tgggaggtta tattcaaaag 
421 aaaaaggaac tgtggggaat ctgtatcttt attgcagcag ttgacctaga tgagatgtcg 
481 ttcactttta ctgagctaca gaaaaacata gaaatcagtg tccataaatt ctttaactta 
541 ctaaaagaaa ttgataccag taccaaagtt gataatgcta tgtcaagact gttgaagaag 
601 tatgatgtat tgtttgcact cttcagcaaa ttggaaagga catgtgaact tatatatttg 
661 acacaaccca gcagttcgat atctactgaa ataaattctg cattggtgct aaaagtttct
721 tggatcacat ttttattagc taaaggggaa gtattacaaa tggaagatga tctggtgatt 
781 tcatttcagt taatgctatg tgtccttgac tattttatta aactctcacc tcccatgttg 
841 ctcaaagaac catataaaac agctgttata cccattaatg gttcacctcg aacacccagg 
901 cgaggtcaga acaggagtgc acggatagca aaacaactag aaaatgatac aagaattatt 
961 gaagttctct gtaaagaaca tgaatgtaat atagatgagg tgaaaaatgt ttatttcaaa
1021 aattttatac cttttatgaa ttctcttgga cttgtaacat ctaatggact tccagaggtt 
1081 gaaaatcttt ctaaacgata cgaagaaatt tatcttaaaa ataaagatct agatgcaaga 
1141 ttatttttgg atcatgataa aactcttcag actgattcta tagacagttt tgaaacacag 
1201 agaacaccac gaaaaagtaa ccttgatgaa gaggtgaatg taattcctcc acacactcca 

1261 gttaggactg ttatgaacac tatccaacaa ttaatgatga ttttaaattc agcaagtgat
1321 caaccttcag aaaatctgat ttcctatttt aacaactgca cagtgaatcc aaaagaaagt 

1381 atactgaaaa gagtgaagga tataggatac atctttaaag agaaatttgc taaagctgtg
1441 ggacagggtt gtgtcgaaat tggatcacag cgatacaaac ttggagttcg cttgtattac 
1501 cgagtaatgg aatccatgct taaatcagaa gaagaacgat tatccattca aaattttagc 
1561 aaacttctga atgacaacat ttttcatatg tctttattgg cgtgcgctct tgaggttgta 
1621 atggccacat atagcagaag tacatctcag aatcttgatt ctggaacaga tttgtctttc 
1681 ccatggattc tgaatgtgct taatttaaaa gcccttgatt tttacaaagt gatcgaaagt 
1741 tttatcaaag cagaaggcaa cttgacaaga gaaatgataa aacatttaga acgatgtgaa 
1801 catcgaatca tggaatccct tgcatggctc tcagattcac ctttatttga tcttattaaa 
1861 caatcaaagg accgagaagg accaactgat caccttgaat ctgcttgtcc tcttaatctt 
1921 cctctccaga ataatcacac tgcagcagat atgtatcttt ctcctgtaag atctccaaag 
1981 aaaaaaggtt caactacgcg tgtaaattct actgcaaatg cagagacaca agcaacctca 
2041 gccttccaga cccagaagcc attgaaatct acctctcttt cactgtttta taaaaaagtg 
2101 tatcggctag cctatctccg gctaaataca ctttgtgaac gccttctgtc tgagcaccca 
2161 gaattagaac atatcatctg gacccttttc cagcacaccc tgcagaatga gtatgaactc 
2221 atgagagaca ggcatttgga ccaaattatg atgtgttcca tgtatggcat atgcaaagtg 
2281 aagaatatag accttaaatt caaaatcatt gtaacagcat acaaggatct tcctcatgct 
2341 gttcaggaga cattcaaacg tgttttgatc aaagaagagg agtatgattc tattatagta 
2401 ttctataact cggtcttcat gcagagactg aaaacaaata ttttgcagta tgcttccacc 
2461 aggcccccta ccttgtcacc aatacctcac attcctcgaa gcccttacaa gtttcctagt 
2521 tcacccttac ggattcctgg agggaacatc tatatttcac ccctgaagag tccatataaa 
2581 atttcagaag gtctgccaac accaacaaaa atgactccaa gatcaagaat cttagtatca 
2641 attggtgaat cattcgggac ttctgagaag ttccagaaaa taaatcagat ggtatgtaac 
2701 agcgaccgtg tgctcaaaag aagtgctgaa ggaagcaacc ctcctaaacc actgaaaaaa 
2761 ctacgctttg atattgaagg atcagatgaa gcagatggaa gtaaacatct cccaggagag
2821 tccaaatttc agcagaaact ggcagaaatg acttctactc gaacacgaat gcaaaagcag 
2881 aaaatgaatg atagcatgga tacctcaaac aaggaagaga aatgaggatc tcaggacctt 
2941 ggtggacact gtgtacacct ctggattcat tgtctctcac agatgtgact gtat

"MPPKTPRKTAATAAAAAAEPPAPPPPPPPEEDPEQDSGPEDLPL
VRLEFEETEEPDFTALCQKLKIPDHVRERAWLTWEKVSSVDGVLGGYIQKKKELWGIC
IFIAAVDLDEMSFTFTELQKNIEISVHKFFNLLKEIDTSTKVDNAMSRLLKKYDVLFA
LFSKLERTCELIYLTQPSSSISTEINSALVLKVSWITFLLAKGEVLQMEDDLVISFQL
MLCVLDYFIKLSPPMLLKEPYKTAVIPINGSPRTPRRGQNRSARIAKQLENDTRIIEV
LCKEHECNIDEVKNVYFKNFIPFMNSLGLVTSNGLPEVENLSKRYEEIYLKNKDLDAR
LFLDHDKTLQTDSIDSFETQRTPRKSNLDEEVNVIPPHTPVRTVMNTIQQLMMILNSA
SDQPSENLISYFNNCTVNPKESILKRVKDIGYIFKEKFAKAVGQGCVEIGSQRYKLGV
RLYYRVMESMLKSEEERLSIQNFSKLLNDNIFHMSLLACALEVVMATYSRSTSQNLDS
GTDLSFPWILNVLNLKAFDFYKVIESFIKAEGNLTREMIKHLERCEHRIMESLAWLSD
SPLFDLIKQSKDREGPTDHLESACPLNLPLQNNHTAADMYLSPVRSPKKKGSTTRVNS
TANAETQATSAFQTQKPLKSTSLSLFYKKVYRLAYLRLNTLCERLLSEHPELEHIIWT
LFQHTLQNEYELMRDRHLDQIMMCSMYGICKVKNIDLKFKIIVTAYKDLPHAVQETFK
RVLIKEEEYDSIIVFYNSVFMQRLKTNILQYASTRPPTLSPIPHIPRSPYKFPSSPLR
IPGGNIYISPLKSPYKISEGLPTPTKMTPRSRILVSIGESFGTSEKFQKINQMVCNSD
RVLKRSAEGSNPPKPLKKLRFDIEGSDEADGSKHLPGESKFQQKLAEMTSTRTRMQKQ
KMNDSMDTSNKEEK"
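
The nucleotide listing and the amino acid listing above can be cross-checked against one another by translating the coding region with the standard genetic code. The following is a minimal sketch, not part of the original disclosure: the helper names are hypothetical, the standard codon table is assumed, and the example uses only one numbered line copied from the listing (the line beginning at position 121, in which the initiating atg starts at the 19th base).

```python
import re

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def clean_sequence(listing_text):
    """Strip position numbers, spaces and line breaks from a sequence listing."""
    return re.sub(r"[^ACGT]", "", listing_text.upper())


def translate(dna):
    """Translate complete codons in frame; '*' marks a stop, 'X' an unknown codon."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


line_121 = "121 ccgccgcgga aaggcgtcat gccgcccaaa accccccgaa aaacggccgc caccgccgcc"
coding = clean_sequence(line_121)[18:]  # skip the 18 bases before the atg
print(translate(coding))
```

Applied to the full listing, the same two helpers give a quick way to spot characters lost or altered in either sequence.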
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>HincII 
! 

>AccI 

i I 

>BglII >SalI 
I ! i I 

10 i 20 30 Ml 40 50 60 

★ *j* * * * * * * * 

GACGGATCGG GAGATCTCCC GATCCCCTAT GGTCGACTCT CAGTACAATC TGCTCTGATG 

>AlwNI 
I 

70 80 90 100 110 120 

*■* * * **■ * * * * 

CCGCATAGTT AAGCCAGTAT CTGCTCCCTG CTTGTGTGTT GGAGGTCGCT GAGTAGTGCG 

>ApoI >MfeI 

i i 

|130 140 150 160 I 170 180 

CGAGCAAAAT TTAAGCTACA ACAAGGCAAG GCTTGACCGA CAATTGCATG AAGAATCTGC 

>HincII 

>AflIII
I 

>NruI >MluI 
! I 
190 200 1210 220 1 230 

* * * * * j * -k * *- J * 

TTAGGGTTAG GCGTTTTGCG CTGCTTCG CGA TGT ACG GGC CAG ATA TAC GCG TTG 

Arg Cys Thr Gly Gln Ile Tyr Ala Leu>
d d CMV PROMOTER d d > 

>SpeI >AseI 
I i 

240 250 I 260 270 280 

* * J * * | ★ * IT * * 

ACA TTG ATT ATT GAC TAG TTA TTA ATA GTA ATC AAT TAC GGG GTC ATT 
Thr Leu Ile Ile Asp *** Leu Leu Ile Val Ile Asn Tyr Gly Val Ile>
d d d d d d_CMV PROMOTER d d d d d d > 

290 300 310 320 330 

AGT TCA TAG CCC ATA TAT GGA GTT CCG CGT TAC ATA ACT TAC GGT AAA 
Ser Ser *** Pro Ile Tyr Gly Val Pro Arg Tyr Ile Thr Tyr Gly Lys>
d d d d d d_CMV PROMOTER d d d d d d > 

>BglI >AatII 

I i 
340 350 360 370 i 

* * * * * * * * * 

TGG CCC GCC TGG CTG ACC GCC CAA CGA CCC CCG CCC ATT GAC GTC AAT 
Trp Pro Ala Trp Leu Thr Ala Gln Arg Pro Pro Pro Ile Asp Val Asn>
d d d d d d CMV PROMOTER d d d d d d > 



380 390 400 410 420^ 

* * * * * * ** ★ * 

AAT GAC GTA TGT TCC CAT AGT AAC GCC AAT AGG GAC TTT CCA TTG ACG 
Asn Asp Val Cys Ser His Ser Asn Ala Asn Arg Asp Phe Pro Leu Thr>
d d d d d d CMV PROMOTER d d d d d d >
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>AatII >Bgl 



T 



430 440 450 460 j 470 

|* * * * * *★} + * * 

TCA ATG GGT GGA CTA TTT ACG GTA AAC TGC CCA CTT GGC AGT ACA TCA
Ser Met Gly Gly Leu Phe Thr Val Asn Cys Pro Leu Gly Ser Thr Ser> 
d d d d d d_CMV PROMOTER d d d d d d > 

>NdeI >AatII 

I * I 

480 ! 490 500 510 i 520 

★ j* * * * * *| * * 

AGT GTA TCA TAT GCC AAG TAC GCC CCC TAT TGA CGT CAA TGA CGG TAA 
Ser Val Ser Tyr Ala Lys Tyr Ala Pro Tyr *** Arg Gln *** Arg ***>
d d d d d d_CMV PROMOTER d d d d d d > 

>BglI 
i 

530 ! 540 550 560 570 

* *j* * * ★ * 

ATG GCC CGC CTG GCA TTA TGC CCA GTA CAT GAC CTT ATG GGA CTT TCC 
Met Ala Arg Leu Ala Leu Cys Pro Val His Asp Leu Met Gly Leu Ser> 
d d d d d d_CMV PROMOTER d d d d d d > 

>BsaAI >NcoI 

I I 
>SnaBI >StyI >MslI 

i I I 

580 590 . 600 610 | 

TAC TTG GCA GTA CAT CTA CGT ATT AGT CAT CGC TAT TAC CAT GGT GAT 
Tyr Leu Ala Val His Leu Arg Ile Ser His Arg Tyr Tyr His Gly Asp>
d d d d d d_CMV PROMOTER d d d d d d > 

620 630 640 . 650 660 

*• * * * * * ★ * ★ * 

GCG GTT TTG GCA GTA CAT CAA TGG GCG TGG ATA GCG GTT TGA CTC ACG 
Ala Val Leu Ala Val His Gln Trp Ala Trp Ile Ala Val *** Leu Thr>
d d d d d d_CMV PROMOTER d d d d d d > 

>AatII >BanI 

i i 
670 680 690 ! 700 710 J 

* * * * * * j *• * + * 

GGG ATT TCC AAG TCT CCA CCC CAT TGA CGT CAA TGG GAG TTT GTT TTG 
Gly Ile Ser Lys Ser Pro Pro His *** Arg Gln Trp Glu Phe Val Leu>
d d d d d d_CMV PROMOTER d d d d d d > 

720 730 740 750 760 

* * * * *■* * 

GCA CCA AAA TCA ACG GGA CTT TCC AAA ATG TCG TAA CAA CTC CGC CCC 
Ala Pro Lys Ser Thr Gly Leu Ser Lys Met Ser *** Gln Leu Arg Pro>
d d d d d d_CMV PROMOTER d d d d d d >

770 780 790 800 ^ 810 

ATT GAC GCA AAT GGG CGG TAG GCG TGT ACG GTG GGA GGT CTA TAT AAG 
Ile Asp Ala Asn Gly Arg *** Ala Cys Thr Val Gly Gly Leu Tyr Lys>
d d d d d d CMV PROMOTER d d d d d d >

>BanII >SacI >BsiHKAI >Ecl136II

820 830 840 850

CAG AGC TCT CTG GCT AAC TAG AGA ACC CAC TGC TTA CTG GCT TAT CGA
Gln Ser Ser Leu Ala Asn *** Arg Thr His Cys Leu Leu Ala Tyr Arg>
d CMV PROMOTER d >

>AseI >T7 PROMOTER >BsaI >SfcI >HindIII >KpnI >Acc65I >BanI

860 870 880 890 900 910

AAT T AATACGA CTCACTATAG GGAGACCCAA GCTTCGCGCG GGTACCACTC
Asn Xxx>
d_>

>PflMI >EarI >PvuII >MspA1I >BanII

920 930 940 950 960 970

TCTTCCGCAT CGCTGTCTGC GAGGGCCAGC TGTTGGGCTC GCGGTTGAGG ACAAACTCTT
e TRIPARTITE LEADER SEQUENCE e >



>EarI >ScaI

980 990 1000 1010 1020 1030

CGCGGTCTTT CCAGTACTCT TGGATCGGAA ACCCGTCGGC CTCCGAACGG TACTCCGCCA
e TRIPARTITE LEADER SEQUENCE e >



>PpuMI >EcoO109I >BsiEI >BsaWI >XhoI >PaeR7I >BsoBI >AvaI >SfcI >MspA1I >BsiEI >EaeI >NotI >EagI

1040 1050 1060 1070 1080 1090

CCGAGGGACC TGAGCGAGTC CGCATCGACC GGATCGGAAA ACCTCTCGAG GCGGCCGCTG
TRIPARTITE LEADER SEQUENCE e >

>ApaI >ClaI >EcoO109I >EcoRV >Bsp120I >SfcI >BspDI >BanII >MslI

1100 1110 1120 1130 1140

CAGTCTAGAC GAATTCGCGT ACGATATCGA TGGGCCCTAT T CTA TAG TGT CAC CTA
Leu *** Cys His Leu>
SP6 PROMOTER >

>BanII >BsiHKAI >SacI >Ecl136II >BclI >BGH POLY A

1150 1160 1170 1180 1190 1200

AAT G CTAGAGCTCG CTGATCAGCC TCGACTGTGC CTTCTAGTTG CCAGCCATCT
Asn>
>

>BanI

1210 1220 1230 1240 1250 1260

GTTGTTTGCC CCTCCCCCGT GCCTTCCTTG ACCCTGGAAG GTGCCACTCC CACTGTCCTT

1270 1280 1290 1300 1310 1320

TCCTAATAAA ATGAGGAAAT TGCATCGCAT TGTCTGAGTA GGTGTCATTC TATTCTGGGG

>BbsI

1330 1340 1350 1360 1370 1380

GGTGGGGTGG GGCAGGACAG CAAGGGGGAG GATTGGGAAG ACAATAGCCG AAATGACCGA

>BssSI >BspMI

1390 1400 1410 1420 1430 1440

CCAAGCGACG CCCAACCTGC CATCACGAGA TTTCGATTCC ACCGCCGCCT TCTATGAAAG

>NaeI >BsrFI >BpmI >NgoMI

1450 1460 1470 1480 1490 1500

GTTGGGCTTC GGAATCGTTT TCCGGGACGC CGGCTGGATG ATCCTCCAGC GCGGGGATCT
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>3omI 
1 

>SV40_early poly_A 
I ^ 

1510 1520 11530 1540 1550 1560 

* * -*-|* * * *■ ★ ★ * 

CATGCTGGAG TTCTTCGCCC ACCCCAACTT GTTTATTGCA GCTTATAATG GTTACAAATA 

>ApoI >BsmI 

I i 
1570 11580 1590 1600 j 1610 1620 

AAGCAATAGC ATCACAAATT TCACAAATAA AGCATTTTTT TCACTGCATT CTAGTTGTGG 

>HincII 
1 

>Bst1107I >AccI
i i 1 

>AccI >SalI 
II III 

1630 1640 1650 1660 | | 1 1670 1680 

* * * * * * * I I * 11* * * * 
TTTGTCCAAA CTCATCAATG TATCTTATCA TGTCTGTATA CCGTCGACCT CTAGCTAGAG 

> 

> >BsrBI 

I 

1690 1700 1710 1720 1730 1740 

* + * * * * -k ~x * j * * * 

CTTGGCGTAA TCATGGTCAT AGCTGTTTCC TGTGTGAAAT TGTTATCCGC TCACAATTCC 
c PUC19 BACKBONE H3 TO AATII c > 

>BanI 
I 

1750 1760 1770 1780 | 1790 1800 

* * * * * * * * | * * * * 

ACACAACATA CGAGCCGGAA GCATAAAGTG TAAAGCCTGG GGTGCCTAAT GAGTGAGCTA 
c PUC19 BACKBONE H3 TO AATII c > 

>AseI 
I 

1810 1820 1830 1840 1850 1860 

ACTCACATTA ATTGCGTTGC GCTCACTGCC CGCTTTCCAG TCGGGAAACC TGTCGTGCCA 
c PUC19 BACKBONE H3 TO AATII c > 

>PvuII 
1 

>MspA1I >AseI >EaeI >HaeII

11! I 
i 1870 11880 1890 1900 1910 i 1920 

*|* * * * ★ * * * * 

GCTGCATTAA TGAATCGGCC AACGCGCGGG GAGAGGCGGT TTGCGTATTG GGCGCTCTTC 
c PUC19 BACKBONE H3 TO AATII c > 

>EarI 
I 

>SapI >BsiEI >BsrBI 

I I i 

| 1930 1940 1950 I 1960 1970 1980 

I** * * **|** * * * * 

CGCTTCCTCG CTCACTGACT CGCTGCGCTC GGTCGTTCGG CTGCGGCGAG CGGTATCAGC 
c PUC19 BACKBONE H3 TO AATII c > 
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>Af 1III 
j 

1990 2000 2010 2020 2030 2040 

■k * k -k * * * * * * * j * 

TCACTCAAAG GCGGTAATAC GGTTATCCAC AGAATCAGGG GATAACGCAG GAAAGAACAT 
c PUC19 BACKBONE H3 TO AATII c > 



2050 2060 2070 2080 2090 2100 

* * **■ *■ * *■ * * ■* 

GTGAGCAAAA GGCCAGCAAA AGGCCAGGAA CCGTAAAAAG GCCGCGTTGC TGGCGTTTTT 
c PUC19 BACKBONE H3 TO AATII c > 



>DrdI 

1 

2110 2120 2130 2140 i 2150 2160 

* * * * * * * * -k * * * 

CCATAGGCTC CGCCCCCCTG ACGAGCATCA CAAAAATCGA CGCTCAAGTC AGAGGTGGCG 
c PUC19 BACKBONE H3 TO AATII c > 



>BssSI 

i 

2170 2180 - 2190 2200 2210 2220 

*■ * * *• * * -k -k k k k ~k 

AAACCCGACA GGACTATAAA GATACCAGGC GTTTCCCCCT GGAAGCTCCC TCGTGCGCTC 
c PUC19 BACKBONE H3 TO AATII c > 



>BsaWI
I 

2230 2240 | 2250 2260 2270 2280 

* * * * | * * * * -k -k k k 

TCCTGTTCCG ACCCTGCCGC TTACCGGATA CCTGTCCGCC TTTCTCCCTT CGGGAAGCGT 
c PUC19 BACKBONE H3 TO AATII c > 



>HaeII >SfcI 
I I 

i 2290 2300 i 2310 2320 2330 2340 

k k k tt j * * * * * * * * 

GGCGCTTTCT CAATGCTCAC GCTGTAGGTA TCTCAGTTCG GTGTAGGTCG TTCGCTCCAA 
c PUC19 BACKBONE H3 TO AATII c > 



>BsiHKAI >MspA1I

i I 

>ApaLI i >BsiEI >BsaWI 

i i I i 1 

2350 i I 2360 2370 2380 2390 2400 

★ ★ j * * * * * j i * * * * * 

GCTGGGCTGT GTGCACGAAC CCCCCGTTCA GCCCGACCGC TGCGCCTTAT CCGGTAACTA 

c PUC19 BACKBONE H3 TO AATII c > 



>AlwNI 
I 

2410 2420 2430 2440 2450 1 2460 

★ * *■ ★ * * ★ * | * * 

TCGTCTTGAG TCCAACCCGG TAAGACACGA CTTATCGCCA CTGGCAGCAG CCACTGGTAA 
c PUC19 BACKBONE H3 TO AATII c > 



>SfcI 

I 

2470 2480 2490 | 2500 2510 2520 

* * * * * * | * * * * * * 

CAGGATTAGC AGAGCGAGGT ATGTAGGCGG TGCTACAGAG TTCTTGAAGT GGTGGCCTAA 
c PUC19 BACKBONE H3 TO AATII c > 



FIG. 4 (CONTINUED)

2530 2540 2550 2560 2570 2580 

★ * * ★ ** * * * * * * 

CTACGGCTAC ACTAGAAGGA CAGTATTTGG TATCTGCGCT CTGCTGAAGC CAGTTACCTT 
c PUC19 BACKBONE H3 TO AATII c > 



>Eco57I >MspA1I
I 1 

I 2590 2600 2610 2620 i 2630 2640 

★ * * * * * * * |* * 

CGGAAAAAGA GTTGGTAGCT CTTGATCCGG CAAACAAACC ACCGCTGGTA GCGGTGGTTT 
c PUC19 BACKBONE H3 TO AATII c > 



2650 2660 2670 2680 2690 2700 

TTTTGTTTGC AAGCAGCAGA TTACGCGCAG AAAAAAAGGA TCTCAAGAAG ATCCTTTGAT
c PUC19 BACKBONE H3 TO AATII c > 



>BspHI 

i 

2710 2720 2730 2740 2750 2760 

* * * * ■** * * ★ * * j * 

CTTTTCTACG GGGTCTGACG CTCAGTGGAA CGAAAACTCA CGTTAAGGGA TTTTGGTCAT 
C PUC19 BACKBONE H3 TO AATII c > 



>DraI >DraI 

I i 
2770 2780 2790 |2800 2810 i 2820 

+ * * * * ★ *j* * * * * 

GAGATTATCA AAAAGGATCT TCACCTAGAT CCTTTTAAAT TAAAAATGAA GTTTTAAATC 
c PUC19 BACKBONE H3 TO AATII c > 



>BanI 
i 

2830 2840 2850 2860 2870 2880 

* * * * * * * * *■ *• ★ j * 

AATCTAAAGT ATATATGAGT AAACTTGGTC TGACAGTTAC CAATGCTTAA TCAGTGAGGC 

a AMP-ORF > 

C PUC19 BACKBONE H3 TO AATII c > 



>AhdI
I 

2890 2900 2910 2920 2930 2940 

■*■*■ * * . * * ** ★ * ★ ★ 

ACCTATCTCA GCGATCTGTC TATTTCGTTC ATCCATAGTT GCCTGACTCC CCGTCGTGTA 

a a AMP-ORF a a > 

c PUC19 BACKBONE H3 TO AATII c > 



>BsaI 
I 

>BsrDI >BpmI 
I I 

2950 2960 2970 2980 2990 1 3000 

GATAACTACG ATACGGGAGG GCTTACCATC TGGCCCCAGT GCTGCAATGA TACCGCGAGA 

a a AMP-ORF a a > 

C PUC19 BACKBONE H3 TO AATII c > 



>BsrFI >BglI
i f 
3010 3020 3030 3040 3050 3060 

** * *■ * ★ ★ 

CCCACGCTCA CCGGCTCCAG ATTTATCAGC AATAAACCAG CCAGCCGGAA GGGCCGAGCG 

a a AMP-ORF a a > 

c PUC19 BACKBONE H3 TO AATII c >

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>AseI 

i 

3070 3080 3090 3100 ( 3110 3120 

* * * * * * * * | *T * * * 

CAGAAGTGGT CCTGCAACTT TATCCGCCTC CATCCAGTCT ATTAATTGTT GCCGGGAAGC 

a a AMP-ORF a a >

c PUC19 BACKBONE H3 TO AATII c >

>Psp1406I >FspI >BsrDI >SfcI

3130 3140 3150 3160 3170 3180

TAGAGTAAGT AGTTCGCCAG TTAATAGTTT GCGCAACGTT GTTGCCATTG CTACAGGCAT 

a a AMP-ORF a a >

c PUC19 BACKBONE H3 TO AATII c > 



>MslI >BsaWI 
I - I 

! 3190 3200 3210 3220 [ 3230 3240 

CGTGGTGTCA CGCTCGTCGT TTGGTATGGC TTCATTCAGC TCCGGTTCCC AACGATCAAG 

a a AMP-ORF a a > 

c PUC19 BACKBONE H3 TO AATII c > 



>PvuI 
I 

>BsiEI 

i 

3250 3260 3270 3280 3290 3300 

* * * * -k if ★ ★ * * ★ * 

GCGAGTTACA TGATCCCCCA TGTTGTGCAA AAAAGCGGTT AGCTCCTTCG GTCCTCCGAT 

a a AMP-ORF a a >

c PUC19 BACKBONE H3 TO AATII c > 



>EaeI >MslI 

I i 
3310 3320 3330 3340 | 3350 3360 

* * * j ★ * ★ * * j * ★ * * 

CGTTGTCAGA AGTAAGTTGG CCGCAGTGTT ATCACTCATG GTTATGGCAG CACTGCATAA 

a a AMP-ORF a a > 

c PUC19 BACKBONE H3 TO AATII c > 



>ScaI 

I 

3370 3380 3390 3400 3410 3420 

** * * *■* * * * 

TTCTCTTACT GTCATGCCAT CCGTAAGATG CTTTTCTGTG ACTGGTGAGT ACTCAACCAA

a a AMP-ORF a a >

c PUC19 BACKBONE H3 TO AATII c >

>BsiEI 
I 

3430 3440 3450 3460 3470 3480 

*■ * ** *j* ** *■ -k * * 

GTCATTCTGA GAATAGTGTA TGCGGCGACC GAGTTGCTCT TGCCCGGCGT CAATACGGGA 

a a AMP-ORF a a > 

C PUC19 BACKBONE H3 TO AATII c > 
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>XmnI 
i 

>DraI >BsiHKAI >Psp1406I
I ! I 

3490 3500 3510 13520 3530 3540 

* ★ * * *j* *j* *|* * * 

TAATACCGCG CCACATAGCA GAACTTTAAA AGTGCTCATC ATTGGAAAAC GTTCTTCGGG 

a a AMP-ORF a a > 

c PUC19 BACKBONE H3 TO AATII c > 



>Eco57I
1 

>ApaLI 

i 

>MspA1I >BssSI
I I I 

3550 3560 I 3570 3580 3590 | 3600 

* * * * * * * * * * 

GCGAAAACTC TCAAGGATCT TACCGCTGTT GAGATCCAGT TCGATGTAAC CCACTCGTGC 

a a AMP-ORF a a > 

C PUC19 BACKBONE H3 TO AATII c > 



>BsiHKAI 
i 

| 3610 3620 3630 3640 3650 3660 

j** * * * * * * * ★ * * 

ACCCAACTGA TCTTCAGCAT CTTTTACTTT CACCAGCGTT TCTGGGTGAG CAAAAACAGG 

a a AMP-ORF a a >

c PUC19 BACKBONE H3 TO AATII c >



>MslI 
I 

3670 3680 3690 3700 3710 3720 

* * * * * * -k ie * * * * 

AAGGCAAAAT GCCGCAAAAA AGGGAATAAG GGCGACACGG AAATGTTGAA TACTCATACT

a a AMP-ORF a a > 

c PUC19 BACKBONE H3 TO AATII c > 



>EarI >SspI >BspHI >BsrBI 

II i I 

j 3730 i 3740 3750 3760 I 3770 t 3780 

CTTCCTTTTT CAATATTATT GAAGCATTTA TCAGGGTTAT TGTCTCATGA GCGGATACAT 
c PUC19 BACKBONE H3 TO AATII c > 



3790 3800 3810 3820 3830 3840 

** ** ** * * * * ★ ★ 

ATTTGAATGT ATTTAGAAAA ATAAACAAAT AGGGGTTCCG CGCACATTTC CCCGAAAAGT 
c PUC19 BACKBONE H3 TO AATII c > 



>HincII 

I 

>AccI 
i I 

>AatII 
I i 

>SalI 

I I I 

3850 j i i 
* Mi 

GCCACCTGAC GTC 
c >
>HincII
I 

>AccI 
I I 

>BglII >SalI 
I IN 
10 | 20 30 i | ! 40 50 60 

* * I * * ★ * [ | j* * ★ * * * 

GACGGATCGG GAGATCTCCC GATCCCCTAT GGTCGACTCT CAGTACAATC TGCTCTGATG 

>AlwNI 
I 

70 80 90 100 110 120 

CCGCATAGTT AAGCCAGTAT CTGCTCCCTG CTTGTGTGTT GGAGGTCGCT GAGTAGTGCG 



>ApoI >MfeI 

I i 

130 140 150 160 170 180

★ j* * * ★ * * * j * * * * 

CGAGCAAAAT TTAAGCTACA ACAAGGCAAG GCTTGACCGA CAATTGCATG AAGAATCTGC 



>HincII 

>AflIII 
I 

>NruI >MluI 

i I 
190 200 i 210 220 | 230 

* *• * *j* * ★ * | * 

TTAGGGTTAG GCGTTTTGCG CTGCTTCG CGA TGT ACG GGC CAG ATA TAC GCG TTG

Arg Cys Thr Gly Gln Ile Tyr Ala Leu>
e e CMV PROMOTER e e > 



>SpeI >AseI 

I i 

240 250 | 260 270 280 

* *j* * j * * * * * 

ACA TTG ATT ATT GAC TAG TTA TTA ATA GTA ATC AAT TAC GGG GTC ATT 
Thr Leu Ile Ile Asp *** Leu Leu Ile Val Ile Asn Tyr Gly Val Ile>
e e e e e e CMV PROMOTER e e e e e e >



^ 290 300 310 320 330 

* * * * * * * * 

AGT TCA TAG CCC ATA TAT GGA GTT CCG CGT TAC ATA ACT TAC GGT AAA 
Ser Ser *** Pro Ile Tyr Gly Val Pro Arg Tyr Ile Thr Tyr Gly Lys>
e e e e e e CMV PROMOTER e e e e e e > 



>BglI >AatII 

i i 
340 350 360 370 ! 

* ★ ■* * * * * * * 

TGG CCC GCC TGG CTG ACC GCC CAA CGA CCC CCG CCC ATT GAC GTC AAT 
Trp Pro Ala Trp Leu Thr Ala Gln Arg Pro Pro Pro Ile Asp Val Asn>
e e e e e e CMV PROMOTER e e e e e e > 



380 390 400 410 420 

* * * ★* * * * * 

AAT GAC GTA TGT TCC CAT AGT AAC GCC AAT AGG GAC TTT CCA TTG ACG 
Asn Asp Val Cys Ser His Ser Asn Ala Asn Arg Asp Phe Pro Leu Thr> 
e e e e e e CMV PROMOTER e e e e e e >
>AatII >BglI

I ! 

430* 440 450 460 | 470 

I* * * * ★ *• * ★ 

TCA ATG GGT GGA CTA TTT ACG GTA AAC TGC CCA CTT GGC AGT ACA TCA 
Ser Met Gly Gly Leu Phe Thr Val Asn Cys Pro Leu Gly Ser Thr Ser> 
e e e e e e_CMV PROMOTER e e e e e e > 

>NdeI >AatII

i I 
480 i 490 500 510 j 520 

★ | * * * * * ir | * * 

AGT GTA TCA TAT GCC AAG TAC GCC CCC TAT TGA CGT CAA TGA CGG TAA 
Ser Val Ser Tyr Ala Lys Tyr Ala Pro Tyr *** Arg Gln *** Arg ***>
e e e e e e_CMV PROMOTER e e e e e e > 

>BglI 
i 

530 | 540 550 560 570 

* *j* * ★ * * * ★ * 

ATG GCC CGC CTG GCA TTA TGC CCA GTA CAT GAC CTT ATG GGA CTT TCC 
Met Ala Arg Leu Ala Leu Cys Pro Val His Asp Leu Met Gly Leu Ser> 
e e e e e e_CMV PROMOTER e e e e e e >

>BsaAI >NcoI 

t I 

>SnaBI >StyI >MslI 

! I I 

580 590 600 610 | 

* yr * * * ★ * * * 

TAC TTG GCA GTA CAT CTA CGT ATT AGT CAT CGC TAT TAC CAT GGT GAT 
Tyr Leu Ala Val His Leu Arg Ile Ser His Arg Tyr Tyr His Gly Asp>
e e e e e e_CMV PROMOTER e e e e e e > 

620 630 640 650 660 

GCG GTT TTG GCA GTA CAT CAA TGG GCG TGG ATA GCG GTT TGA CTC ACG 
Ala Val Leu Ala Val His Gln Trp Ala Trp Ile Ala Val *** Leu Thr>
e e e e e e_CMV PROMOTER e e e e e e > 

>AatII >BanI 
! ! 
670 680 690 j 700 710 [ 

* * * ★ * *j* -k * * 

GGG ATT TCC AAG TCT CCA CCC CAT TGA CGT CAA TGG GAG TTT GTT TTG 
Gly Ile Ser Lys Ser Pro Pro His *** Arg Gln Trp Glu Phe Val Leu>
e e e e e e_CMV PROMOTER e e e e e e > 

720 730 740 750 760 

* * *■ ★ * * *■ ★ *■ 

GCA CCA AAA TCA ACG GGA CTT TCC AAA ATG TCG TAA CAA CTC CGC CCC 
Ala Pro Lys Ser Thr Gly Leu Ser Lys Met Ser *** Gln Leu Arg Pro>
e e e e e e_CMV PROMOTER e e e e e e > 

770 780 790 800 810 

★ * ★ * * * ★ * * * 

ATT GAC GCA AAT GGG CGG TAG GCG TGT ACG GTG GGA GGT CTA TAT AAG 
Ile Asp Ala Asn Gly Arg *** Ala Cys Thr Val Gly Gly Leu Tyr Lys>
e e e e e e CMV PROMOTER e e e e e e >
>SacI
i

>BanII

i

>BsiHKAI
I

>Ecl136II

I I

820 830 840 850



* j | * * * * * * * 

CAG AGC TCT CTG GCT AAC TAG AGA ACC CAC TGC TTA CTG GCT TAT CGA 
Gln Ser Ser Leu Ala Asn *** Arg Thr His Cys Leu Leu Ala Tyr Arg>
e e e e e e CMV PROMOTER e e e e e e > 



>AseI
>T7_PROMOTER
>BsaI
>SfcI
>HindIII
>KpnI
>Acc65I
>BanI

860 870 880 890 900 910



AAT T AATACGA CTCACTATAG GGAGACCCAA GCTTCGCGCG GGTACCACTC 
Asn Xxx> 
e > 



>PflMI
>EarI
>PvuII
>BanII

920 930 940 950 960 970



TCTTCCGCAT CGCTGTCTGC GAGGGCCAGC TGTTGGGCTC GCGGTTGAGG ACAAACTCTT 
f f TRIPARTITE LEADER f f > 



>EarI >ScaI 
i 1 

| 980 i 990 1000 1010 1020 1030 

j*-* * * * ~K ★ * * * * ★ 

CGCGGTCTTT CCAGTACTCT TGGATCGGAA ACCCGTCGGC CTCCGAACGG TACTCCGCCA 
f f TRIPARTITE LEADER f f > 



>EcoO109I
>PpuMI
>BsiEI
>BsaWI
>BsoBI
>AvaI
>XhoI
>PaeR7I

1040 1050 1060 1070 1080 1090



CCGAGGGACC TGAGCGAGTC CGCATCGACC GGATCGGAAA ACCTCTCGAG GAACTGAAAA
TRIPARTITE LEADER >
>HincII >EcoO109I >BsaWI

I 1 I

>HpaI >PpuMI >BamHI

i "ill 

1100 1 1110 1120 1130 1140 I 1150 

ACCAGAAAGT TAACTGGTAA GTTTAGTCTT TTTGTCTTTT TATTTCAGGT CCCGGATCCG 

b HYBRID SV40 LATE INTRON b >



>BseRI >StuI

1160 1170 1180 1190 1200 1210

GTGGTGGTGC AAATCAAAGA ACTGCTCCTC AGTGGATGTT GCCTTTACTT CTAGGCCTGT
b HYBRID SV40 LATE INTRON b >



>BsiEI 
i 

>EagI i 

t i 

>EaeI >XbaI

i i I 
>SacII >PstI 

I I I I 
>NotI i >SfcI I I 

II f I I 

1220 1230 1240 1250 i 1260 ! | 1270 

* * * * * * * * * j * j * | * 

ACGGAAGTGT TACTTCTGCT CTAAAAGCTG CGGAATTGTA CCCGCGGCCG CTGCAGTCTA 
HYBRID SV40 LATE INTRON b > 



>ApaI 

f 

>BspDI >EcoO109I 
1 i i 

>ApoI >EcoRV >Bsp120I

I i I 11! 

>EcoRI >BsiWI >ClaI >BanII >SfcI >MslI
I 1 1 i 1 I 1 I i

1280 1290 1300 1310 1320

j* * j j * * * * I* 

GACGAATTCG CGTACGATAT CGATGGGCCC TATT CTA TAG TGT CAC CTA AAT 

Leu *** Cys His Leu Asn> 
c SP6 PROMOTER c > 



>SacI 
1 

>BanII 
1 

>BsiHKAI 
1 

>Ecl136II >BclI

I ! ! 
>BGH_POLY_A [ | 

1 I i I 

I 1330 | | 1340 1350 1360 1370 1380 

j *■ j * j * ★ * * * * * * 

GCTAGAGC TCGCTGATCA GCCTCGACTG TGCCTTCTAG TTGCCAGCCA TCTGTTGTTT 



>BanI

1390 1400 1410 1420 1430 1440

GCCCCTCCCC CGTGCCTTCC TTGACCCTGG AAGGTGCCAC TCCCACTGTC CTTTCCTAAT

1450 1460 1470 1480 1490 1500



AAAATGAGGA AATTGCATCG CATTGTCTGA GTAGGTGTCA TTCTATTCTG GGGGGTGGGG 

>BbsI

1510 1520 1530 1540 1550 1560

TGGGGCAGGA CAGCAAGGGG GAGGATTGGG AAGACAATAG CCGAAATGAC CGACCAAGCG

>BspMI
>BssSI

1570 1580 1590 1600 1610 1620

ACGCCCAACC TGCCATCACG AGATTTCGAT TCCACCGCCG CCTTCTATGA AAGGTTGGGC

>NaeI
>NgoMI
>BpmI
>BsrFI

1630 1640 1650 1660 1670 1680

TTCGGAATCG TTTTCCGGGA CGCCGGCTGG ATGATCCTCC AGCGCCGGGA TCTCATGCTG

>BpmI
>SV40_early_poly_A

1690 1700 1710 1720 1730 1740

GAGTTCTTCG CCCACCCCAA CTTGTTTATT GCAGCTTATA ATGGTTACAA ATAAAGCAAT

>ApoI >BsmI

1750 1760 1770 1780 1790 1800

AGCATCACAA ATTTCACAAA TAAAGCATTT TTTTCACTGC ATTCTAGTTG TGGTTTGTCC



>HincII
>Bst1107I >AccI
>AccI
>SalI

1810 1820 1830 1840 1850 1860

AAACTCATCA ATGTATCTTA TCATGTCTGT ATACCGTCGA CCTCTAGCTA GAGCTTGGCG



>BsrBI

1870 1880 1890 1900 1910 1920



TAATCATGGT CATAGCTGTT TCCTGTGTGA AATTGTTATC CGCTCACAAT TCCACACAAC 
d d PUC19 BACKBONE d d >
>BanI

I 

1930 1940 1950 I 1960 1970 1980 

* * * ★ * ★ j * * *■ * ★ * 

ATACGAGCCG GAAGCATAAA GTGTAAAGCC TGGGGTGCCT AATGAGTGAG CTAACTCACA 
d d PUC19 BACKBONE d d > 

>AseI 

I 

>AseI >PvuII | 

I 1 i 

I 1990 2000 2010 2020 2030 ! 2040 

I** * * * ★ *■*!*■* 

TTAATTGCGT TGCGCTCACT GCCCGCTTTC CAGTCGGGAA ACCTGTCGTG CCAGCTGCAT 
d d PUC19 BACKBONE d d > 

>EarI 

i 

>EaeI >HaeII >SapI 

I I ! 

2050 2060 2070 2080 2090 i 2100 

*j* * * * * * * j * j* * 

TAATGAATCG GCCAACGCGC GGGGAGAGGC GGTTTGCGTA TTGGGCGCTC TTCCGCTTCC 
d d PUC19 BACKBONE d_ d > 

>BsiEI >BsrBI

I i
2110 2120 2130 2140 2150 2160

* * * * * j * * * j * * * ■* 

TCGCTCACTG ACTCGCTGCG CTCGGTCGTT CGGCTGCGGC GAGCGGTATC AGCTCACTCA 
d d PUC19 BACKBONE d d > 

>AflIII
i 

2170 2180 2190 2200 2210 2220 

* * * ★ * * * * * * * * 

AAGGCGGTAA TACGGTTATC CACAGAATCA GGGGATAACG CAGGAAAGAA CATGTGAGCA 
d d PUC19 BACKBONE d d > 

2230 2240 2250 2260 2270 2280 

* •* ★■*■ ★ * * * 

AAAGGCCAGC AAAAGGCCAG GAACCGTAAA AAGGCCGCGT TGCTGGCGTT TTTCCATAGG 
d d PUC19 BACKBONE d d > 

>DrdI 

i 

2290 2300 2310 2320 2330 2340 

* * * * *j* * * * * 

CTCCGCCCCC CTGACGAGCA TCACAAAAAT CGACGCTCAA GTCAGAGGTG GCGAAACCCG 
d d PUC19 BACKBONE d d > 

>BssSI 
I 

2350 2360 2370 2380 | 2390 2400 

* ★ * * * * * * | * * * ★ 

ACAGGACTAT AAAGATACCA GGCGTTTCCC CCTGGAAGCT CCCTCGTGCG CTCTCCTGTT
d d PUC19 BACKBONE d d >
>BsaWI >HaeII

i i

2410 2420 2430 2440 2450 2460

* * * ★ * * * * * * * j ★ 

CCGACCCTGC CGCTTACCGG ATACCTGTCC GCCTTTCTCC CTTCGGGAAG CGTGGCGCTT 
d d PUC19 BACKBONE d d > 



>SfcI 

I 

2470 i 2480 2490 2500 2510 2520 

*★ * * * * -r -k * * * * 

TCTCAATGCT CACGCTGTAG GTATCTCAGT TCGGTGTAGG TCGTTCGCTC CAAGCTGGGC 
d d PUC19 BACKBONE d d > 



>BsiHKAI 
I 

>ApaLI | >BsiEI >BsaWI 

I i i I 

I 2530 2540 2550 2560 I 2570 2580 

*• * * * * *j* * * ★ 

TGTGTGCACG AACCCCCCGT TCAGCCCGAC CGCTGCGCCT TATCCGGTAA CTATCGTCTT 

d d PUC19 BACKBONE d d > 



>AlwNI 
I 

2590 2600 2610 2620 2630 2640

* * *•* * *> ★ ★ ★ j * *■ * 

GAGTCCAACC CGGTAAGACA CGACTTATCG CCACTGGCAG CAGCCACTGG TAACAGGATT 
d d PUC19 BACKBONE d d > 



>SfcI 

i 

2650 2660 2670 2680 2690 2700

* ★ * * * j * * * * * * ★ 

AGCAGAGCGA GGTATGTAGG CGGTGCTACA GAGTTCTTGA AGTGGTGGCC TAACTACGGC 
d d PUC19 BACKBONE d d > 



>Eco57I 

2710 2720 2730 2740 2750 2760 

* * * * * * * * * * *j* 

TACACTAGAA GGACAGTATT TGGTATCTGC GCTCTGCTGA AGCCAGTTAC CTTCGGAAAA 
d d PUC19 BACKBONE d d > 



2770 2780 2790 2800 2810 2820 

* * ★ * * * * *■* 

AGAGTTGGTA GCTCTTGATC CGGCAAACAA ACCACCGCTG GTAGCGGTGG TTTTTTTGTT 
d d PUC19 BACKBONE d d > 



2830 2840 2850 2860 2870 2880 

* * * * * * * * * * * * 

TGCAAGCAGC AGATTACGCG CAGAAAAAAA GGATCTCAAG AAGATCCTTT GATCTTTTCT 
______ d d PUC19 BACKBONE d d > 



>BspHI 
I 

2890 2900 2910 2920 2930 2940 

* * * * * * * * * * *- ★ 

ACGGGGTCTG ACGCTCAGTG GAACGAAAAC TCACGTTAAG GGATTTTGGT CATGAGATTA 
d d PUC19 BACKBONE d d >
>DraI >DraI
i I 
2950 2960 2970 2980 2990 3000 

* * * * * j * * ★ * j * * ie 

TCAAAAAGGA TCTTCACCTA GATCCTTTTA AATTAAAAAT GAAGTTTTAA ATCAATCTAA
d d PUC19 BACKBONE d d > 



>BanI 
I 

3010 3020 3030 3040 3050 I 3060 

* ★ * * * * * * * * j * ★ 

AGTATATATG AGTAAACTTG GTCTGACAGT TACCAATGCT TAATCAGTGA GGCACCTATC 

_a AMP-ORF a > 

d d PUC19 BACKBONE d d > 



>AhdI 
i 

3070 3080 3090 3100 j 3110 3120 

* * * * * ★ * * j * * * * 

TCAGCGATCT GTCTATTTCG TTCATCCATA GTTGCCTGAC TCCCCGTCGT GTAGATAACT 

a a AMP-ORF a a > 

d d PUC19 BACKBONE d d > 



>BsaI

! 

>BsrDI >BpmI 
! I 

3130 3140 3150 3160 1 3170 | 3180 

*> * ■*■ ★ * * * |+ * j * * 

ACGATACGGG AGGGCTTACC ATCTGGCCCC AGTGCTGCAA TGATACCGCG AGACCCACGC 

a a AMP-ORF a a >

d d PUC19 BACKBONE d d > 



>BsrFI >BglI 

! I 

I 3190 3200 3210 3220 | 3230 3240 

I * * * * * * * * j *■ * * * 

TCACCGGCTC CAGATTTATC AGCAATAAAC CAGCCAGCCG GAAGGGCCGA GCGCAGAAGT 

a a AMP-ORF a a > 

d d PUC19 BACKBONE d d > 



>AseI 
I 

3250 3260 3270 I 3280 3290 3300 

* *• * * * * * * * * 

GGTCCTGCAA CTTTATCCGC CTCCATCCAG TCTATTAATT GTTGCCGGGA AGCTAGAGTA 

a a AMP-ORF a a > 

d d PUC19 BACKBONE d d > 



>Psp1406I

i 

>FspI i >BsrDI >SfcI >MslI 

i 1 i i 1 

3310 3320 1 3330 3340 | 3350 j 3360 

-* ★ * * * | * * j * j* -k * * 

AGTAGTTCGC CAGTTAATAG TTTGCGCAAC GTTGTTGCCA TTGCTACAGG CATCGTGGTG 

a a AMP-ORF a a > 

d d PUC19 BACKBONE d d >
>BsaWI

i 

3370 3380 3390 I 3400 3410 3420 

* * * * * * J** * * 

TCACGCTCGT CGTTTGGTAT GGCTTCATTC AGCTCCGGTT CCCAACGATC AAGGCGAGTT 

a a AMP-ORF a a > 

d d PUC19 BACKBONE d d > 



>BsiEI 
I 

>PvuI 
I 

3430 3440 3450 3460 3470 j 3480 

* * * ★ * * * * * * j * * 

ACATGATCCC CCATGTTGTG CAAAAAAGCG GTTAGCTCCT TCGGTCCTCC GATCGTTGTC 

a a AMP-ORF a a > 

d d PUC19 BACKBONE d d > 



>EaeI >MslI 

I I 
3490 1 3500 3510 I 3520 3530 3540 

* * | * * * * j** * * * * 

AGAAGTAAGT TGGCCGCAGT GTTATCACTC ATGGTTATGG CAGCACTGCA TAATTCTCTT 

a a AMP-ORF a a >

d d PUC19 BACKBONE d d > 



>ScaI 
1 

3550 3560 3570 3580 | 3590 3600

■W* * * * * j * *- * * 

ACTGTCATGC CATCCGTAAG ATGCTTTTCT GTGACTGGTG AGTACTCAAC CAAGTCATTC 

a a AMP-ORF a a > 

d d PUC19 BACKBONE d d > 



>BsiEI 
I 

3610 3620 | 3630 3640 3650 3660 

* * **!**■ * * ■** * * 

TGAGAATAGT GTATGCGGCG ACCGAGTTGC TCTTGCCCGG CGTCAATACG GGATAATACC 

a a AMP-ORF a a > 

d d PUC19 BACKBONE d d > 



>Psp1406I

i 

>DraI >BsiHKAI >XmnI 

I I I 

3670 3680 3690 3700 } 3710 3720 

★ * ★ * *|* * * j * * * * 

GCGCCACATA GCAGAACTTT AAAAGTGCTC ATCATTGGAA AACGTTCTTC GGGGCGAAAA 

a a AMP-ORF a a > 

d d PUC19 BACKBONE d d > 



>ApaLI 
I 

>Eco57I 
I 

>BssSI i >BsiHKAI 

I 1 I 

3730 3740 3750 3760 3770 i 3780 

* * * * * * * *■ * f * |* * 

CTCTCAAGGA TCTTACCGCT GTTGAGATCC AGTTCGATGT AACCCACTCG TGCACCCAAC 

a a AMP-ORF a a > 

d d PUC19 BACKBONE d d >
3790 3800 3810 3820 3830 3840

** * * * * * * * * * * 

TGATCTTCAG CATCTTTTAC TTTCACCAGC GTTTCTGGGT GAGCAAAAAC AGGAAGGCAA 

a a AMP-ORF a a > 

d d PUC19 BACKBONE d d > 



>MslI >EarI 

i I 
3850 3860 3870 I 3880 3890 3900 

AATGCCGCAA AAAAGGGAAT AAGGGCGACA CGGAAATGTT GAATACTCAT ACTCTTCCTT 

a a AMP-ORF a > 

d d PUC19 BACKBONE d d > 

>SspI >BspHI >BsrBI 

1 i I 

3910 3920 3930 3940 I 3950 3960 

*j* * * * * * | * * * 

TTTCAATATT ATTGAAGCAT TTATCAGGGT TATTGTCTCA TGAGCGGATA CATATTTGAA 
d d PUC19 BACKBONE d d > 

3970 3980 3990 4000 4010 4020 

* * * ★ ** * * ** * * 

TGTATTTAGA AAAATAAACA AATAGGGGTT CCGCGCACAT TTCCCCGAAA AGTGCCACCT 
d d PUC19 BACKBONE d d > 

>HincII 
I 

>AatII 
i i 

>AccI 
I ! 

>SalI 
I I ! 
I* i 
GACGTC

>
>HincII

t

i

>AccI

1 1 

>BglII >SalI 
| Ml 
10 i 20 30 M I 40 50 60 

* * I * ~ * * * * * * * 

GACGGATCGG GAGATCTCCC GATCCCCTAT GGTCGACTCT CAGTACAATC TGCTCTGATG 

>AlwNI 

70 80 90 100 110 120 

* * * * * * * * * * 

CCGCATAGTT AAGCCAGTAT CTGCTCCCTG CTTGTGTGTT GGAGGTCGCT GAGTAGTGCG 

>ApoI >MfeI 

| ! 

130 140 150 160 170 180

* 1 * * * * * * * j * * „ * 

CGAGCAAAAT TTAAGCTACA ACAAGGCAAG GCTTGACCGA CAATTGCATG AAGAATCTGC 



>HincII 



>AflIII
I 

>NruI >MluI 
I i 
190 200 210 220 230



* 



* 



TTAGGGTTAG GCGTTTTGCG CTGCTTCG CGA TGT ACG GGC CAG ATA TAC GCG TTG 

Arg Cys Thr Gly Gln Ile Tyr Ala Leu>
f f CMV PROMOTER f f > 

>SpeI >AseI 
I I 

240 250 ! 260 270 280 

* *j* *|** * * * 

ACA TTG ATT ATT GAC TAG TTA TTA ATA GTA ATC AAT TAC GGG GTC ATT 
Thr Leu Ile Ile Asp *** Leu Leu Ile Val Ile Asn Tyr Gly Val Ile>
f f f f f f_CMV PROMOTER f f f f f f >

290 300 310 320 330 
*********** 
AGT TCA TAG CCC ATA TAT GGA GTT CCG CGT TAC ATA ACT TAC GGT AAA 
Ser Ser *** Pro Ile Tyr Gly Val Pro Arg Tyr Ile Thr Tyr Gly Lys>
f f f f f f_CMV PROMOTER f f f f f f > 

>BglI >AatII 

I i 
340 350 360 370 I 

* * * * * * * * * 

TGG CCC GCC TGG CTG ACC GCC CAA CGA CCC CCG CCC ATT GAC GTC AAT 
Trp Pro Ala Trp Leu Thr Ala Gln Arg Pro Pro Pro Ile Asp Val Asn>
f f f f f f CMV PROMOTER f f f f f f > 



380 390 400 410 420 

******~*** 
AAT GAC GTA TGT TCC CAT AGT AAC GCC AAT AGG GAC TTT CCA TTG ACG 
Asn Asp Val Cys Ser His Ser Asn Ala Asn Arg Asp Phe Pro Leu Thr> 
f f f f f f CMV PROMOTER f f f f f f >
>AatII >BglI

i I 

430 440 450 460 i 470 

I * * ★ * * * *j* * * 

TCA ATG GGT GGA CTA TTT ACG GTA AAC TGC CCA CTT GGC AGT ACA TCA 
Ser Met Gly Gly Leu Phe Thr Val Asn Cys Pro Leu Gly Ser Thr Ser> 
f f f f f f_CMV PROMOTER f f f f f f > 

>NdeI >AatII 

i i 
480 ! 490 500 510 | 520 



AGT GTA TCA TAT GCC AAG TAC GCC CCC TAT TGA CGT CAA TGA CGG TAA 
Ser Val Ser Tyr Ala Lys Tyr Ala Pro Tyr *** Arg Gln *** Arg ***>
f f f f f f_CMV PROMOTER f f f f f f > 

>BglI 

530 i 540 550 560 570 

* *j* * * * * * * * 

ATG GCC CGC CTG GCA TTA TGC CCA GTA CAT GAC CTT ATG GGA CTT TCC 
Met Ala Arg Leu Ala Leu Cys Pro Val His Asp Leu Met Gly Leu Ser> 
f f f f f f__CMV PROMOTER f f f f f f > 

>BsaAI >StyI 

I I 
>SnaBI >NcoI >MslI 

i i i 

580 590 600 610 I 

* * * * * * * * * 

TAC TTG GCA GTA CAT CTA CGT ATT AGT CAT CGC TAT TAC CAT GGT GAT 
Tyr Leu Ala Val His Leu Arg Ile Ser His Arg Tyr Tyr His Gly Asp>
f f f f f f_CMV PROMOTER f f f f f f > 

620 630 640 650 660 



GCG GTT TTG GCA GTA CAT CAA TGG GCG TGG ATA GCG GTT TGA CTC ACG 
Ala Val Leu Ala Val His Gln Trp Ala Trp Ile Ala Val *** Leu Thr>
f f f f f f_CMV PROMOTER f f f f f f > 

>AatII >BanI 
I * I 

670 680 690 ! 700 710 I 

* * * * * *|* * ★ ★ 

GGG ATT TCC AAG TCT CCA CCC CAT TGA CGT CAA TGG GAG TTT GTT TTG 
Gly Ile Ser Lys Ser Pro Pro His *** Arg Gln Trp Glu Phe Val Leu>
f f f f f f_CMV PROMOTER f f f f f f > 

720 730 740 750 760 

* * * * * * * ★ * 

GCA CCA AAA TCA ACG GGA CTT TCC AAA ATG TCG TAA CAA CTC CGC CCC 
Ala Pro Lys Ser Thr Gly Leu Ser Lys Met Ser *** Gln Leu Arg Pro>
f f f f f f_CMV PROMOTER f f f f f f > 

770 780 790 800 810 

* ****** *** 

ATT GAC GCA AAT GGG CGG TAG GCG TGT ACG GTG GGA GGT CTA TAT AAG 
Ile Asp Ala Asn Gly Arg *** Ala Cys Thr Val Gly Gly Leu Tyr Lys>
f f f f f f CMV PROMOTER f f f f f f >
>BsiHKAI
I

>SacI

I

>BanII
I

>Ecl136II
I i 

| 820 830 840 

* * * * * * * 

CAG AGC TCT CTG GCT AAC TAG AGA ACC CAC TGC TTA CTG GCT TAT CGA 
Gln Ser Ser Leu Ala Asn *** Arg Thr His Cys Leu Leu Ala Tyr Arg>
f f f f f f CMV PROMOTER f f f f f f > 



850 



>AseI
>T7 PROMOTER
>BsaI
>SfcI
>HindIII
>KpnI
>BanI
>Acc65I

860 870 880 890 900 910



* { * * 

AAT T AATACGA CTCACTATAG GGAGACCCAA GCTTCGCGCG GGTACCACTC 
Asn Xxx> 
f > 



>PflMI
>EarI
>PvuII
>BanII

920 930 940 950 960 970



* j * * * * j * 
TCTTCCGCAT CGCTGTCTGC GAGGGCCAGC TGTTGGGCTC GCGGTTGAGG ACAAACTCTT 
g TRIPARTITE LEADER SEQUENCE g > 



>EarI
>ScaI

980 990 1000 1010 1020 1030



CGCGGTCTTT CCAGTACTCT TGGATCGGAA ACCCGTCGGC CTCCGAACGG TACTCCGCCA 
g TRIPARTITE LEADER SEQUENCE g > 

>XhoI
>AvaI
>BsiEI >BsoBI
>BsaWI >PaeR7I
>EcoO109I
>PpuMI

1040 1050 1060 1070 1080 1090

, * I I* 

CCGAGGGACC TGAGCGAGTC CGCATCGACC GGATCGGAAA ACCTCTCGAG GAACTGAAAA
TRIPARTITE LEADER SEQUENCE g >



>HpaI >PpuMI 

i I 
>HincII >EcoO109I

! I 
1100 | 1110 1120 1130 1140 1150 

**!** * * * * * [ * * * 

ACCAGAAAGT TAACTGGTAA GTTTAGTCTT TTTGTCTTTT TATTTCAGGT CCCGGATCTG 
b HYBRID SV40 LATE INTRON b b >

>Ppu10I
I 

>21 bp tandem repeat III [ 110] ,[ 102] ,[ 112] I 

1160 1170 1180 1190 1200 1210

★ * * * * * * | * * * * |* 

AGTTAGGGCG GGACATGGGC GGAGTTAGGG GCGGGACTAT GGTTGCTGAC TAATTGAGAT 

< h h_EARLY MRNA h 

>SphI 
I 

>NsiI 

i <72_bp_tandem_repeat__enhancer_sequence_ 

1 ~ I 

! 1220 1230 1240 1250 1260 I 1270 

* * * * * * * * **|** 

GCATGCTTTG CATACTTCTG CCTGCTGGGG AGCCTGGGGA CTTTCCACAC CTGGTTGCTG 
< h h EARLY MRNA h h 

>NsiI 
I 

>Ppu10I >SphI

1280 t ! 1290 1300 1310 1320 1330 

* * * * * * * * 

ACTAATTGAG ATGCATGCTT TGCATACTTC TGCCTGCTGG GGAGCCTGGG GACTTTCCAC 
< h h EARLY MRNA h h 

>PvuII >BsaWI >BseRI 

i I i 
< 72 bp_tandem_repeat_enhancer_sequence_B_ 

<T_ant igen_binding_site_II_ ! 

1 111 I 

1340 1350 1360 1370 1380 1390

ACCCTAACTG ACACACATTC CACAGCTGGT TCTTTCAGAT CCGGTGGTGG TGCAAATCAA 

HYBRID SV40 > 

< h EARLY MRNA h 

" MINOR LATE 19S > 



>StuI 
I 

1400 1410 1420 1430 1440 1450 

** ** ** * j * ** ** 

AGAACTGCTC CTCAGTGGAT GTTGCCTTTA CTTCTAGGCC TGTACGGAAG TGTTACTTCT 
c HYBRID SV40 LATE INTRON c >
>BsiEI
>NotI
>EaeI
>SacII
>EagI
>XbaI
>PstI
>EcoRI
>SfcI
>ApoI >BsiWI

1460 1470 1480 1490 1500 1510

* * * ★ * i * t i * i * 

GCTCTAAAAG CTGCGGAATT GTACCCGCGG CCGCTGCAGT CTAGACGAAT TCGCGTACGA 
HYBRID SV40 LATE INT >



>ApaI
>BspDI >BanII
>ClaI >EcoO109I
>SacI
>BsiHKAI
>BanII
>EcoRV
>Bsp120I
>SfcI
>MslI >Ecl136II
>BGH_POLY_A
>BclI

1520 1530 1540 1550 1560



TATCGATGGG CCCTATT CTA TAG TGT CAC CTA AAT GCTAG AGCTCGCTGA 

Leu *** Cys His Leu Asn> 
d SP6 PROMOTER d > 



1570 1580 1590 1600 1610 1620 

** * * * * + * * * * * 

TCAGCCTCGA CTGTGCCTTC TAGTTGCCAG CCATCTGTTG TTTGCCCCTC CCCCGTGCCT 

>BanI 
! 

1630 j 1640 1650 1660 1670 1680 

* j * * * * * * * * * 

TCCTTGACCC TGGAAGGTGC CACTCCCACT GTCCTTTCCT AATAAAATGA GGAAATTGCA 



1690 1700 1710 1720 1730 1740

*■ * * it 

TCGCATTGTC TGAGTAGGTG TCATTCTATT CTGGGGGGTG GGGTGGGGCA GGACAGCAAG 

>BspMI 
I 

>BbsI >BssSI 

i I 
1750 1760 1770 1780 1790 1800 

* * ie * * * * * * * * 

GGGGAGGATT GGGAAGACAA TAGCCGAAAT GACCGACCAA GCGACGCCCA ACCTGCCATC 

1810 1820 1830 1840 1850 1860 

** * ★ * * * ★ * * * * 

ACGAGATTTC GATTCCACCG CCGCCTTCTA TGAAAGGTTG GGCTTCGGAA TCGTTTTCCG
>NaeI
>BpmI
>BsrFI
>NgoMI

1870 1880 1890 1900 1910 1920

GGACGCCGGC TGGATGATCC TCCAGCGCGG GGATCTCATG CTGGAGTTCT TCGCCCACCC 

>BpmI >ApoI 
I 

>SV40_early_poly_A
i

1930 1940 1950 1960 1970 1980

CAACTTGTTT ATTGCAGCTT ATAATGGTTA CAAATAAAGC AATAGCATCA CAAATTTCAC

>BsmI
!

1990 2000 2010 2020 2030 2040

AAATAAAGCA TTTTTTTCAC TGCATTCTAG TTGTGGTTTG TCCAAACTCA TCAATGTATC



>Bst1107I
>AccI
>HincII
>AccI
>SalI

2050 2060 2070 2080 2090 2100

* * I | * | * | * * * * * * * * 

TTATCATGTC TGTATACCGT CGACCTCTAG CTAGAGCTTG GCGTAATCAT GGTCATAGCT 

PUC19 BACKBONE > 



>BsrBI

2110 2120 2130 2140 2150 2160



GTTTCCTGTG TGAAATTGTT ATCCGCTCAC AATTCCACAC AACATACGAG CCGGAAGCAT 
e e PUC19 BACKBONE e e > 



>BanI
>AseI

2170 2180 2190 2200 2210 2220

* * * | * * * * * j* * 

AAAGTGTAAA GCCTGGGGTG CCTAATGAGT GAGCTAACTC ACATTAATTG CGTTGCGCTC 
e e PUC19 BACKBONE e e > 



>AseI >EaeI
>PvuII

2230 2240 2250 2260 2270 2280

ACTGCCCGCT TTCCAGTCGG GAAACCTGTC GTGCCAGCTG CATTAATGAA TCGGCCAACG 
e e PUC19 BACKBONE e e >
>SapI
I 

>HaeII >EarI 

I ! 

2290 2300 2310 i 2320 2330 2340 

* * * * **|*|* * * * * 

CGCGGGGAGA GGCGGTTTGC GTATTGGGCG CTCTTCCGCT TCCTCGCTCA CTGACTCGCT 
e e PUC19 BACKBONE e e > 

>BsiEI >BsrBI 

I I 
2350 2360 |2370 2380 2390 2400 

*j* * * * | * * * * * * * 

GCGCTCGGTC GTTCGGCTGC GGCGAGCGGT ATCAGCTCAC TCAAAGGCGG TAATACGGTT 
e e PUC19 BACKBONE e e > 

>Af 

i 

2410 2420 2430 I 2440 2450 2460 

** * * * * | * * * * * * 

ATCCACAGAA TCAGGGGATA ACGCAGGAAA GAACATGTGA GCAAAAGGCC AGCAAAAGGC 
e e PUC19 BACKBONE e e > 

2470 2480 2490 2500 2510 2520 

* * * * * * * * * * * * 

CAGGAACCGT AAAAAGGCCG CGTTGCTGGC GTTTTTCCAT AGGCTCCGCC CCCCTGACGA 
e e PUC19 BACKBONE e e > 

>DrdI 

2530 2540 i 2550 2560 2570 2580 

** * * j * * * * * ■* * * 

GCATCACAAA AATCGACGCT CAAGTCAGAG GTGGCGAAAC CCGACAGGAC TATAAAGATA 
e e PUC19 BACKBONE e e > 

>BssSI >BsaWI 

I I 
2590 2600 [2610 2620 2630 2640 

* * * ★ * | * * * * * *j* 

CCAGGCGTTT CCCCCTGGAA GCTCCCTCGT GCGCTCTCCT GTTCCGACCC TGCCGCTTAC 
- e e PUC19 BACKBONE e e > 

>HaeII >SfcI 

I I 

2650 2660 2670 2680 | 2690 2700 

* * * * ** ★ * I * * * | * 

CGGATACCTG TCCGCCTTTC TCCCTTCGGG AAGCGTGGCG CTTTCTCAAT GCTCACGCTG 
e e PUC19 BACKBONE e e > 

>BsiHKAI

>ApaLI 

I i 

2710 2720 2730 2740 2750 i 2760 

** ** * * * * * | * j * * 

TAGGTATCTC AGTTCGGTGT AGGTCGTTCG CTCCAAGCTG GGCTGTGTGC ACGAACCCCC 
e e PUC19 BACKBONE e e >
>BsiEI >BsaWI

I I 

2770 ! 2780 ! 2790 2800 2810 2820 

* * I * * *j* * * *■* * * 

CGTTCAGCCC GACCGCTGCG CCTTATCCGG TAACTATCGT CTTGAGTCCA ACCCGGTAAG 
. e e PUC19 BACKBONE e e > 

>AlwNI 
I 

2830 2840 2850 2860 2870 2880 

* * * * * | * * * * * * * 

ACACGACTTA TCGCCACTGG CAGCAGCCAC TGGTAACAGG ATTAGCAGAG CGAGGTATGT 
e e PUC19 BACKBONE e e > 

>SfcI 

2890 2900 2910 2920 2930 2940 

*l* * ★ * * * * * * * * 

AGGCGGTGCT ACAGAGTTCT TGAAGTGGTG GCCTAACTAC GGCTACACTA GAAGGACAGT 
e e PUC19 BACKBONE e e > 

>Eco57I 
i 

2950 2960 2970 2980 | 2990 3000 

** * * * * * * 

ATTTGGTATC TGCGCTCTGC TGAAGCCAGT TACCTTCGGA AAAAGAGTTG GTAGCTCTTG 
e e PUC19 BACKBONE e e > 

3010 3020 3030 3040 3050 3060 

* * * * * * ★ * * * * 

ATCCGGCAAA CAAACCACCG CTGGTAGCGG TGGTTTTTTT GTTTGCAAGC AGCAGATTAC 
e e PUC19 BACKBONE e e > 

3070 3080 3090 3100 3110 3120 

* * * * * * * * ★ * * * 

GCGCAGAAAA AAAGGATCTC AAGAAGATCC TTTGATCTTT TCTACGGGGT CTGACGCTCA 
e e PUC19 BACKBONE e e > 

>BspHI 

r 

3130 3140 3150 \ 3160 3170 3180 

* * ** ** * * 

GTGGAACGAA AACTCACGTT AAGGGATTTT GGTCATGAGA TTATCAAAAA GGATCTTCAC 
e e PUC19 BACKBONE e e > 

>DraI >DraI 
1 ! 

3190 | 3200 3210 I 3220 3230 3240 

* * I * * **}** ** ** 

CTAGATCCTT TTAAATTAAA AATGAAGTTT TAAATCAATC TAAAGTATAT ATGAGTAAAC 
e e PUC19 BACKBONE e e > 

>BanI 
I 

3250 3260 3270 I 3280 3290 3300 

** ** * * j * * ** * * 

TTGGTCTGAC AGTTACCAAT GCTTAATCAG TGAGGCACCT ATCTCAGCGA TCTGTCTATT 

a a AMP-ORF_ a a > 

e e PUC19 BACKBONE e e >
>AhdI
I

3310 3320 3330 3340 3350 3360

* * * * * j * * * * * * -*■ 

TCGTTCATCC ATAGTTGCCT GACTCCCCGT CGTGTAGATA ACTACGATAC GGGAGGGCTT 

a a AMP-ORF a a > 

e e PUC19 BACKBONE e e >



>BsaI 
! 

>BsrDI >BpmI >BsrFI 

I ! ! 

3370 3380 3390 3400 3410 3420

■k * * * * | * * j * nr j * * * 

ACCATCTGGC CCCAGTGCTG CAATGATACC GCGAGACCCA CGCTCACCGG CTCCAGATTT 

a a AMP-ORF a a > 

e e PUC19 BACKBONE e e > 



>BglI 
I 

3430 3440 | 3450 3460 3470 3480 

* * **■ *j* * * * * * * 

ATCAGCAATA AACCAGCCAG CCGGAAGGGC CGAGCGCAGA AGTGGTCCTG CAACTTTATC 

a a AMP-ORF a a > 

e e PUC19 BACKBONE e e > 



>AseI 
i 

3490 3500 3510 3520 3530 3540 

+ * *j* * * * * * * * * 

CGCCTCCATC CAGTCTATTA ATTGTTGCCG GGAAGCTAGA GTAAGTAGTT CGCCAGTTAA 

a a AMP-ORF a a > 

e e PUC19 BACKBONE e e > 



>Psp1406I

>FspI I >BsrDI >SfcI >MslI 

IE i I i 

3550 I 3560 I 3570 3580 3590 3600 

* I * I * * | * j + *j* * * * * 

TAGTTTGCGC AACGTTGTTG CCATTGCTAC AGGCATCGTG GTGTCACGCT CGTCGTTTGG

a a AMP-ORF a a >

e e PUC19 BACKBONE e e > 



>BsaWI 

i 

3610 3620 3630 3640 3650 3660 

* * *j* * * * * * * ★ * 

TATGGCTTCA TTCAGCTCCG GTTCCCAACG ATCAAGGCGA GTTACATGAT CCCCCATGTT 

a a AMP-ORF a a > 

e e PUC19 BACKBONE e e > 



>BsiEI 
i 

>PvuI >EaeI 

I i 

3670 3680 3690 3700 3710 3720

★ * ** * * * | * * ★ | * * 

GTGCAAAAAA GCGGTTAGCT CCTTCGGTCC TCCGATCGTT GTCAGAAGTA AGTTGGCCGC 

a a AMP-ORF_ a a > 

e e PUC19 BACKBONE e e > 



FIG. 8 

(CONTINUED)
>MslI
t 

3730 3740 3750 3760 3770 3780 

* * * | * ★ * * * * * * 

AGTGTTATCA CTCATGGTTA TGGCAGCACT GCATAATTCT CTTACTGTCA TGCCATCCGT 

a a AMP-ORF a a > 

e e PUC19 BACKBONE e e > 



>ScaI 
I 

3790 3800 I 3810 3820 3830 3840 

*■* **• * j * * * * * 

AAGATGCTTT TCTGTGACTG GTGAGTACTC AACCAAGTCA TTCTGAGAAT AGTGTATGCG 

a a AMP-ORF a a > 

e e PUC19 BACKBONE e e >



>BsiEI 



| 3850 3860 3870 3880 3890 3900 

* * ★ * * * * * 

GCGACCGAGT TGCTCTTGCC CGGCGTCAAT ACGGGATAAT ACCGCGCCAC ATAGCAGAAC 

a a AMP-ORF a a > 

e e PUC19 BACKBONE e e >



>Psp1406I
1 

>DraI >BsiHKAI >XmnI 
i I i 

| 3910 | 3920 I 3930 3940 3950 3960 

-** * * * * * * 

TTTAAAAGTG CTCATCATTG GAAAACGTTC TTCGGGGCGA AAACTCTCAA GGATCTTACC 

a a AMP-ORF a a >

e e PUC19 BACKBONE e e >

>Eco57I 

i 

>ApaLI 



>BssSI 

I 

3970 3980 3990 

* * * * * * 



>BsiHKAI
I 

4000 4010 4020 



GCTGTTGAGA TCCAGTTCGA TGTAACCCAC TCGTGCACCC AACTGATCTT CAGCATCTTT

a a AMP-ORF a a >

e e PUC19 BACKBONE e e >



4030 4040 4050 4060 4070 4080 

** * * * * ** * * * * 

TACTTTCACC AGCGTTTCTG GGTGAGCAAA AACAGGAAGG CAAAATGCCG CAAAAAAGGG 

a a AMP-ORF a a >

e e PUC19 BACKBONE e e >



>MslI >EarI >SspI 

i i I 

4090 4100 4110 4120 4130 4140

** *j* * * * * | * * ** 

AATAAGGGCG ACACGGAAAT GTTGAATACT CATACTCTTC CTTTTTCAAT ATTATTGAAG 

a AMP-ORF a > 

e e PUC19 BACKBONE e e >



FIG. 8 

(CONTINUED) 






>BspHI >BsrBI 

I I 

4150 4160 i 4170 4180 4190 4200 

★ * * * | * | * * * * * * * 

CATTTATCAG GGTTATTGTC TCATGAGCGG ATACATATTT GAATGTATTT AGAAAAATAA 
e e PUC19 BACKBONE e e > 



>HincII 

i 

>AccI 
I I 

>AatII 
i I 

>SalI 
i f I 

4210 4220 4230 4240 [ i I 

* * * * * * * * | | | 

ACAAATAGGG GTTCCGCGCA CATTTCCCCG AAAAGTGCCA CCTGACGTC 
e PUC19 BACKBONE e > 



FIG. 8 

(CONTINUED)
inventor (if only one name is listed below) or an original, first and joint inventor (if plural inventors are named below) of the subject
Application No. 08/801,092 and was amended on (if applicable).

I have reviewed and understand the contents of the above identified specification, including the claims, as amended by any amendment
the like so made are punishable by fine or imprisonment, or both, under Section 1001 of Title 18 of the United States Code, and
SEQUENCE LISTING



(1) GENERAL INFORMATION: 

(i) APPLICANT: Antelman, Douglas 

Gregory, Richard J. 
Wills, Kenneth N.

(ii) TITLE OF INVENTION: Tissue Specific Expression of 
Retinoblastoma Protein 

(iii) NUMBER OF SEQUENCES : 46 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: TOWNSEND and TOWNSEND and CREW LLP

(B) STREET: Two Embarcadero Center, 8th Floor 

(C) CITY: San Francisco 

(D) STATE: CA 

(E) COUNTRY: USA 

(F) ZIP : 94111 

(v) COMPUTER READABLE FORM : 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: US 08/801,092 

(B) FILING DATE: 14-FEB-1997 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 08/751,517 

(B) FILING DATE: 15-NOV-1996 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Fitts, Renee A. 

(B) REGISTRATION NUMBER: 35,136 

(C) REFERENCE/DOCKET NUMBER: 016930-001020

(ix) TELECOMMUNICATION INFORMATION: 
v (A) TELEPHONE: 415-576-0200 
(B) TELEFAX: 703-576-0300 



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 437 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

Met Ala Leu Ala Gly Ala Pro Ala Gly Gly Pro Cys Ala Pro Ala Leu
1 5 10 15

Glu Ala Leu Leu Gly Ala Gly Ala Leu Arg Leu Leu Asp Ser Ser Gln
20 25 30

Ile Val Ile Ile Ser Ala Ala Gln Asp Ala Ser Ala Pro Pro Ala Pro
35 40 45

Thr Gly Pro Ala Ala Pro Ala Ala Gly Pro Cys Asp Pro Asp Leu Leu
50 55 60

Leu Phe Ala Thr Pro Gln Ala Pro Arg Pro Thr Pro Ser Ala Pro Arg
65 70 75 80

Pro Ala Leu Gly Arg Pro Pro Val Lys Arg Arg Leu Asp Leu Glu Thr
85 90 95

Asp His Gln Tyr Leu Ala Glu Ser Ser Gly Pro Ala Arg Gly Arg Gly
100 105 110

Arg His Pro Gly Lys Gly Val Lys Ser Pro Gly Glu Lys Ser Arg Tyr
115 120 125

Glu Thr Ser Leu Asn Leu Thr Thr Lys Arg Phe Leu Glu Leu Leu Ser
130 135 140

His Ser Ala Asp Gly Val Val Asp Leu Asn Trp Ala Ala Glu Val Leu
145 150 155 160

Lys Val Gln Lys Arg Arg Ile Tyr Asp Ile Thr Asn Val Leu Glu Gly
165 170 175

Ile Gln Leu Ile Ala Lys Lys Ser Lys Asn His Ile Gln Trp Leu Gly
180 185 190

Ser His Thr Thr Val Gly Val Gly Gly Arg Leu Glu Gly Leu Thr Gln
195 200 205

Asp Leu Arg Gln Leu Gln Glu Ser Glu Gln Gln Leu Asp His Leu Met
210 215 220

Asn Ile Cys Thr Thr Gln Leu Arg Leu Leu Ser Glu Asp Thr Asp Ser
225 230 235 240

Gln Arg Leu Ala Tyr Val Thr Cys Gln Asp Leu Arg Ser Ile Ala Asp
245 250 255

Pro Ala Glu Gln Met Val Met Val Ile Lys Ala Pro Pro Glu Thr Gln
260 265 270

Leu Gln Ala Val Asp Ser Ser Glu Asn Phe Gln Ile Ser Leu Lys Ser
275 280 285

Lys Gln Gly Pro Ile Asp Val Phe Leu Cys Pro Glu Glu Thr Val Gly
290 295 300

Gly Ile Ser Pro Gly Lys Thr Pro Ser Gln Glu Val Thr Ser Glu Glu
305 310 315 320

Glu Asn Arg Ala Thr Asp Ser Ala Thr Ile Val Ser Pro Pro Pro Ser
325 330 335

Ser Pro Pro Ser Ser Leu Thr Thr Asp Pro Ser Gln Ser Leu Leu Ser
340 345 350

Leu Glu Gln Glu Pro Leu Leu Ser Arg Met Gly Ser Leu Arg Ala Pro
355 360 365

Val Asp Glu Asp Arg Leu Ser Pro Leu Val Ala Ala Asp Ser Leu Leu
370 375 380

Glu His Val Arg Glu Asp Phe Ser Gly Leu Leu Pro Glu Glu Phe Ile
385 390 395 400

Ser Leu Ser Pro Pro His Glu Ala Leu Asp Tyr His Phe Gly Leu Glu
405 410 415

Glu Gly Glu Gly Ile Arg Asp Leu Phe Asp Cys Asp Phe Gly Asp Leu
420 425 430

Thr Pro Leu Asp Phe
435

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2517 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

GGAATTCCGT GGCCGGGACT TTGCAGGCAG CGGCGGCCGG GGGCGGAGCG GGATCGAGCC 60

CTCGCCGAGG CCTGCCGCCA TGGGCCCGCG CCGCCGCCGC CGCCTGTCAC CCGGGCCGCG 120

CGGGCCGTGA GCGTCATGGC CTTGGCCGGG GCCCCTGCGG GCGGCCCATG CGCGCCGGCG 180

CTGGAGGCCC TGCTCGGGGC CGGCGCGCTG CGGCTGCTCG ACTCCTCGCA GATCGTCATC 240

ATCTCCGCCG CGCAGGACGC CAGCGCCCCG CCGGCTCCCA CCGGCCCCGC GGCGCCCGCC 300

GCCGGCCCCT GCGACCCTGA CCTGCTGCTC TTCGCCACAC CGCAGGCGCC CCGGCCCACA 360

CCCAGTGCGC CGCGGCCCGC GCTCGGCCGC CCGCCGGTGA AGCGGAGGCT GGACCTGGAA 420

ACTGACCATC AGTACCTGGC CGAGAGCAGT GGGCCAGCTC GGGGCAGAGG CCGCCATCCA 480

GGAAAAGGTG TGAAATCCCC GGGGGAGAAG TCACGCTATG AGACCTCACT GAATCTGACC 540

ACCAAGCGCT TCCTGGAGCT GCTGAGCCAC TCGGCTGACG GTGTCGTCGA CCTGAACTGG 600

GCTGCCGAGG TGCTGAAGGT GCAGAAGCGG CGCATCTATG ACATCACCAA CGTCCTTGAG 660

GGCATCCAGC TCATTGCCAA GAAGTCCAAG AACCACATCC AGTGGCTGGG CAGCCACACC 720

ACAGTGGGCG TCGGCGGACG GCTTGAGGGG TTGACCCAGG ACCTCCGACA GCTGCAGGAG 780

AGCGAGCAGC AGCTGGACCA CCTGATGAAT ATCTGTACTA CGCAGCTGCG CCTGCTCTCC 840

GAGGACACTG ACAGCCAGCG CCTGGCCTAC GTGACGTGTC AGGACCTTCG TAGCATTGCA 900

GACCCTGCAG AGCAGATGGT TATGGTGATC AAAGCCCCTC CTGAGACCCA GCTCCAAGCC 960

GTGGACTCTT CGGAGAACTT TCAGATCTCC CTTAAGAGCA AACAAGGCCC GATCGATGTT 1020

TTCCTGTGCC CTGAGGAGAC CGTAGGTGGG ATCAGCCCTG GGAAGACCCC ATCCCAGGAG 1080

GTCACTTCTG AGGAGGAGAA CAGGGCCACT GACTCTGCCA CCATAGTGTC ACCACCACCA 1140

TCATGTCCCC CCTCATCCCT CACCACAGAT CCCAGCCAGT CTCTACTCAG CCTGGAGCAA 1200

GAACCGCTGT TGTCCCGGAT GGGCAGCCTG CGGGCTCCCG TGGACGAGGA CCGCCTGTCC 1260

CCGCTGGTGG CGGCCGACTC GCTCCTGGAG CATGTGCGGG AGGACTTCTC CGGCCTCCTC 1320

CCTGAGGAGT TCATCAGCCT TTCCCCACCC CACGAGGCCC TCGACTACCA CTTCGGCCTC 1380

GAGGAGGGCG AGGGCATCAG AGACCTCTTC GACTGTGACT TTGGGGACCT CACCCCCCTG 1440

GATTTCTGAC AGGGCTTGGA GGGACCAGGG TTTCCAGAGT AGCTCACCTT GTCTCTGCAG 1500

CCCTGGAGCC CCCTGTCCCT GGCCGTCCTC CCAGCCTGTT TGGAAACATT TAATTTATAC 1560

CCCTCTCCTC TGTCTCCAGA AGCTTCTAGC TCTGGGGTCT GGCTACCGCT AGGAGGCTGA 1620

GCAAGCCAGG AAGGGAAGGA GTCTGTGTGG TGTGTATGTG CATGCAGCCT ACACCCACAC 1680

??GTGTACCG GGGGTGAATG TGTGTGAGCA TGTGTGTGTG CATGTACCGG GGAATGAAGG 1740

TGAACATACA CCTCTGTGTG TGCACTGCAG ACACGCCCCA GTGTGTCCAC ATGTGTGTGC 1800

ATGAGTCCAT CTCTGCGCGT GGGGGGGCTC TAACTGCACT TTCGGCCCTT TTGCTCGTGG 1860

GGTCCCACAA GGCCCAGGGC AGTGCCTGCT CCCAGAATCT GGTGCTCTGA CCAGGCCAGG 1920

TGGGGAGGCT TTGGCTGGCT GGGCGTGTAG GACGGTGAGA GCACTTCTGT CTTAAAGGTT 1980

TTTTCTGATT GAAGCTTTAA TGGAGCGTTA TTTATTTATC GAGGCCTCTT TGGTGAGCCT 2040

GGGGAATCAG CAAAAGGGGA GGAGGGGTGT GGGGTTGATA CCCCAACTCC CTCTACCCTT 2100

GAGCAAGGGC AGGGGTCCCT GAGCTGTTCT TCTGCCCCAT ACTGAAGGAA CTGAGGCCTG 2160

GGTGATTTAT TTATTGGGAA AGTGAGGGAG GGAGACAGAC TGACTGACAG CCATGGGTGG 2220

TCAGATGGTG GGGTGGGCCC TCTCCAGGGG GCCAGTTCAG GGCCCAGCTG CCCCCCAGGA 2280

TGGATATGAG ATGGGAGAGG TGAGTGGGGG ACCTTCACTG ATGTGGGCAG GAGGGGTGGT 2340

GAAGGCCTCC CCCAGCCCAG ACCCTGTGGT CCCTCCTGCA GTGTCTGAAG CGCCTGCCTC 2400

CCCACTGCTC TGCCCCACCC TCCAATCTGC ACTTTGATTT GCTTCCTAAC AGCTCTGTTC 2460

CCTCCTGCTT TGGTTTTAAT AAATATTTTG ATGACGTTAA AAAAAGGAAT TCGATAT 2517



(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2994 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

TGGATCACAT

AATTTTATAC CTTTTATGAA TTCTCTTGGA CTTGTAACAT CTAATGGACT TCCAGAGGTT 1080

GAAAATCTTT CTAAACGATA CGAAGAAATT TATCTTAAAA ATAAAGATCT AGATGCAAGA 1140

TTATTTTTGG ATCATGATAA AACTCTTCAG ACTGATTCTA TAGACAGTTT TGAAACACAG 1200

AGAACACCAC GAAAAAGTAA CCTTGATGAA GAGGTGAATG TAATTCCTCC ACACACTCCA 1260

GTTAGGACTG TTATGAACAC TATCCAACAA TTAATGATGA TTTTAAATTC AGCAAGTGAT 1320

CAACCTTCAG AAAATCTGAT TTCCTATTTT AACAACTGCA CAGTGAATCC AAAAGAAAGT 1380

ATACTGAAAA GAGTGAAGGA TATAGGATAC ATCTTTAAAG AGAAATTTGC TAAAGCTGTG 1440

GGACAGGGTT GTGTCGAAAT TGGATCACAG CGATACAAAC TTGGAGTTCG CTTGTATTAC 1500

CGAGTAATGG AATCCATGCT TAAATCAGAA GAAGAACGAT TATCCATTCA AAATTTTAGC 1560

AAACTTCTGA ATGACAACAT TTTTCATATG TCTTTATTGG CGTGCGCTCT TGAGGTTGTA 1620

ATGGCCACAT ATAGCAGAAG TACATCTCAG AATCTTGATT CTGGAACAGA TTTGTCTTTC 1680

CCATGGATTC TGAATGTGCT TAATTTAAAA GCCTTTGATT TTTACAAAGT GATCGAAAGT 1740

TTTATCAAAG CAGAAGGCAA CTTGACAAGA GAAATGATAA AACATTTAGA ACGATGTGAA 1800

CATCGAATCA TGGAATCCCT TGCATGGCTC TCAGATTCAC CTTTATTTGA TCTTATTAAA 1860

CAATCAAAGG ACCGAGAAGG ACCAACTGAT CACCTTGAAT CTGCTTGTCC TCTTAATCTT 1920

CCTCTCCAGA ATAATCACAC TGCAGCAGAT ATGTATCTTT CTCCTGTAAG ATCTCCAAAG 1980

AAAAAAGGTT CAACTACGCG TGTAAATTCT ACTGCAAATG CAGAGACACA AGCAACCTCA 2040

GCCTTCCAGA CCCAGAAGCC ATTGAAATCT ACCTCTCTTT CACTGTTTTA TAAAAAAGTG 2100

TATCGGCTAG CCTATCTCCG GCTAAATACA CTTTGTGAAC GCCTTCTGTC TGAGCACCCA 2160

GAATTAGAAC ATATCATCTG GACCCTTTTC CAGCACACCC TGCAGAATGA GTATGAACTC 2220

ATGAGAGACA GGCATTTGGA CCAAATTATG ATGTGTTCCA TGTATGGCAT ATGCAAAGTG 2280

AAGAATATAG ACCTTAAATT CAAAATCATT GTAACAGCAT ACAAGGATCT TCCTCATGCT 2340

GTTCAGGAGA CATTCAAACG TGTTTTGATC AAAGAAGAGG AGTATGATTC TATTATAGTA 2400

TTCTATAACT CGGTCTTCAT GCAGAGACTG AAAACAAATA TTTTGCAGTA TGCTTCCACC 2460

AGGCCCCCTA CCTTGTCACC AATACCTCAC ATTCCTCGAA GCCCTTACAA GTTTCCTAGT 2520

TCACCCTTAC GGATTCCTGG AGGGAACATC TATATTTCAC CCCTGAAGAG TCCATATAAA 2580

ATTTCAGAAG GTCTGCCAAC ACCAACAAAA ATGACTCCAA GATCAAGAAT CTTAGTATCA 2640

ATTGGTGAAT CATTCGGGAC TTCTGAGAAG TTCCAGAAAA TAAATCAGAT GGTATGTAAC 2700

AGCGACCGTG TGCTCAAAAG AAGTGCTGAA GGAAGCAACC CTCCTAAACC ACTGAAAAAA 2760

CTACGCTTTG ATATTGAAGG ATCAGATGAA GCAGATGGAA GTAAACATCT CCCAGGAGAG 2820

TCCAAATTTC AGCAGAAACT GGCAGAAATG ACTTCTACTC GAACACGAAT GCAAAAGCAG 2880

AAAATGAATG ATAGCATGGA TACCTCAAAC AAGGAAGAGA AATGAGGATC TCAGGACCTT 2940

GGTGGACACT GTGTACACCT CTGGATTCAT TGTCTCTCAC AGATGTGACT GTAT 2994
(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 928 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Pro Pro Lys Thr Pro Arg Lys Thr Ala Ala Thr Ala Ala Ala Ala
1 5 10 15

Ala Ala Glu Pro Pro Ala Pro Pro Pro Pro Pro Pro Pro Glu Glu Asp
20 25 30

Pro Glu Gln Asp Ser Gly Pro Glu Asp Leu Pro Leu Val Arg Leu Glu
35 40 45

Phe Glu Glu Thr Glu Glu Pro Asp Phe Thr Ala Leu Cys Gln Lys Leu
50 55 60

Lys Ile Pro Asp His Val Arg Glu Arg Ala Trp Leu Thr Trp Glu Lys
65 70 75 80

Val Ser Ser Val Asp Gly Val Leu Gly Gly Tyr Ile Gln Lys Lys Lys
85 90 95

Glu Leu Trp Gly Ile Cys Ile Phe Ile Ala Ala Val Asp Leu Asp Glu
100 105 110

Met Ser Phe Thr Phe Thr Glu Leu Gln Lys Asn Ile Glu Ile Ser Val
115 120 125

His Lys Phe Phe Asn Leu Leu Lys Glu Ile Asp Thr Ser Thr Lys Val
130 135 140

Asp Asn Ala Met Ser Arg Leu Leu Lys Lys Tyr Asp Val Leu Phe Ala
145 150 155 160

Leu Phe Ser Lys Leu Glu Arg Thr Cys Glu Leu Ile Tyr Leu Thr Gln
165 170 175

Pro Ser Ser Ser Ile Ser Thr Glu Ile Asn Ser Ala Leu Val Leu Lys
180 185 190

Val Ser Trp Ile Thr Phe Leu Leu Ala Lys Gly Glu Val Leu Gln Met
195 200 205

Glu Asp Asp Leu Val Ile Ser Phe Gln Leu Met Leu Cys Val Leu Asp
210 215 220

Tyr Phe Ile Lys Leu Ser Pro Pro Met Leu Leu Lys Glu Pro Tyr Lys
225 230 235 240

Thr Ala Val Ile Pro Ile Asn Gly Ser Pro Arg Thr Pro Arg Arg Gly
245 250 255

Gln Asn Arg Ser Ala Arg Ile Ala Lys Gln Leu Glu Asn Asp Thr Arg
260 265 270

Ile Ile Glu Val Leu Cys Lys Glu His Glu Cys Asn Ile Asp Glu Val
275 280 285

Lys Asn Val Tyr Phe Lys Asn Phe Ile Pro Phe Met Asn Ser Leu Gly
290 295 300

Leu Val Thr Ser Asn Gly Leu Pro Glu Val Glu Asn Leu Ser Lys Arg
305 310 315 320

Tyr Glu Glu Ile Tyr Leu Lys Asn Lys Asp Leu Asp Ala Arg Leu Phe
325 330 335

Leu Asp His Asp Lys Thr Leu Gln Thr Asp Ser Ile Asp Ser Phe Glu
340 345 350

Thr Gln Arg Thr Pro Arg Lys Ser Asn Leu Asp Glu Glu Val Asn Val
355 360 365

Ile Pro Pro His Thr Pro Val Arg Thr Val Met Asn Thr Ile Gln Gln
370 375 380

Leu Met Met Ile Leu Asn Ser Ala Ser Asp Gln Pro Ser Glu Asn Leu
385 390 395 400

Ile Ser Tyr Phe Asn Asn Cys Thr Val Asn Pro Lys Glu Ser Ile Leu
405 410 415

Lys Arg Val Lys Asp Ile Gly Tyr Ile Phe Lys Glu Lys Phe Ala Lys
420 425 430

Ala Val Gly Gln Gly Cys Val Glu Ile Gly Ser Gln Arg Tyr Lys Leu
435 440 445

Gly Val Arg Leu Tyr Tyr Arg Val Met Glu Ser Met Leu Lys Ser Glu
450 455 460

Glu Glu Arg Leu Ser Ile Gln Asn Phe Ser Lys Leu Leu Asn Asp Asn
465 470 475 480

Ile Phe His Met Ser Leu Leu Ala Cys Ala Leu Glu Val Val Met Ala
485 490 495

Thr Tyr Ser Arg Ser Thr Ser Gln Asn Leu Asp Ser Gly Thr Asp Leu
500 505 510

Ser Phe Pro Trp Ile Leu Asn Val Leu Asn Leu Lys Ala Phe Asp Phe
515 520 525

Tyr Lys Val Ile Glu Ser Phe Ile Lys Ala Glu Gly Asn Leu Thr Arg
530 535 540

Glu Met Ile Lys His Leu Glu Arg Cys Glu His Arg Ile Met Glu Ser
545 550 555 560

Leu Ala Trp Leu Ser Asp Ser Pro Leu Phe Asp Leu Ile Lys Gln Ser
565 570 575

Lys Asp Arg Glu Gly Pro Thr Asp His Leu Glu Ser Ala Cys Pro Leu
580 585 590

Asn Leu Pro Leu Gln Asn Asn His Thr Ala Ala Asp Met Tyr Leu Ser
595 600 605

Pro Val Arg Ser Pro Lys Lys Lys Gly Ser Thr Thr Arg Val Asn Ser
610 615 620

Thr Ala Asn Ala Glu Thr Gln Ala Thr Ser Ala Phe Gln Thr Gln Lys
625 630 635 640

Pro Leu Lys Ser Thr Ser Leu Ser Leu Phe Tyr Lys Lys Val Tyr Arg
645 650 655

Leu Ala Tyr Leu Arg Leu Asn Thr Leu Cys Glu Arg Leu Leu Ser Glu
660 665 670

His Pro Glu Leu Glu His Ile Ile Trp Thr Leu Phe Gln His Thr Leu
675 680 685

Gln Asn Glu Tyr Glu Leu Met Arg Asp Arg His Leu Asp Gln Ile Met
690 695 700

Met Cys Ser Met Tyr Gly Ile Cys Lys Val Lys Asn Ile Asp Leu Lys
705 710 715 720

Phe Lys Ile Ile Val Thr Ala Tyr Lys Asp Leu Pro His Ala Val Gln
725 730 735

Glu Thr Phe Lys Arg Val Leu Ile Lys Glu Glu Glu Tyr Asp Ser Ile
740 745 750

Ile Val Phe Tyr Asn Ser Val Phe Met Gln Arg Leu Lys Thr Asn Ile
755 760 765

Leu Gln Tyr Ala Ser Thr Arg Pro Pro Thr Leu Ser Pro Ile Pro His
770 775 780

Ile Pro Arg Ser Pro Tyr Lys Phe Pro Ser Ser Pro Leu Arg Ile Pro
785 790 795 800

Gly Gly Asn Ile Tyr Ile Ser Pro Leu Lys Ser Pro Tyr Lys Ile Ser
805 810 815

Glu Gly Leu Pro Thr Pro Thr Lys Met Thr Pro Arg Ser Arg Ile Leu
820 825 830

Val Ser Ile Gly Glu Ser Phe Gly Thr Ser Glu Lys Phe Gln Lys Ile
835 840 845

Asn Gln Met Val Cys Asn Ser Asp Arg Val Leu Lys Arg Ser Ala Glu
850 855 860

Gly Ser Asn Pro Pro Lys Pro Leu Lys Lys Leu Arg Phe Asp Ile Glu
865 870 875 880

Gly Ser Asp Glu Ala Asp Gly Ser Lys His Leu Pro Gly Glu Ser Lys
885 890 895

Phe Gln Gln Lys Leu Ala Glu Met Thr Ser Thr Arg Thr Arg Met Gln
900 905 910

Lys Gln Lys Met Asn Asp Ser Met Asp Thr Ser Asn Lys Glu Glu Lys
915 920 925

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3853 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 209..250

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 254..289

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 293..505

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 509..514

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 518..520

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 524..658

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 662..691

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 695..748

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 752..781

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 785..829

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1132..1134

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1138..1149

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 833..862

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

GACGGATCGG GAGATCTCCC GATCCCCTAT GGTCGACTCT CAGTACAATC TGCTCTGATG 60

CCGCATAGTT AAGCCAGTAT CTGCTCCCTG CTTGTGTGTT GGAGGTCGCT GAGTAGTGCG 120

CGAGCAAAAT TTAAGCTACA ACAAGGCAAG GCTTGACCGA CAATTGCATG AAGAATCTGC 180

TTAGGGTTAG GCGTTTTGCG CTGCTTCG CGA TGT ACG GGC CAG ATA TAC GCG 232
Arg Cys Thr Gly Gln Ile Tyr Ala
1 5

TTG ACA TTG ATT ATT GAC TAG TTA TTA ATA GTA ATC AAT TAC GGG GTC 280
Leu Thr Leu Ile Ile Asp Leu Leu Ile Val Ile Asn Tyr Gly Val
10 1 5

ATT AGT TCA TAG CCC ATA TAT GGA GTT CCG CGT TAC ATA ACT TAC GGT 328
Ile Ser Ser Pro Ile Tyr Gly Val Pro Arg Tyr Ile Thr Tyr Gly
10 1 5 10

AAA TGG CCC GCC TGG CTG ACC GCC CAA CGA CCC CCG CCC ATT GAC GTC 376
Lys Trp Pro Ala Trp Leu Thr Ala Gln Arg Pro Pro Pro Ile Asp Val
15 20 25

AAT AAT GAC GTA TGT TCC CAT AGT AAC GCC AAT AGG GAC TTT CCA TTG 424
Asn Asn Asp Val Cys Ser His Ser Asn Ala Asn Arg Asp Phe Pro Leu
30 35 40

ACG TCA ATG GGT GGA CTA TTT ACG GTA AAC TGC CCA CTT GGC AGT ACA 472
Thr Ser Met Gly Gly Leu Phe Thr Val Asn Cys Pro Leu Gly Ser Thr
45 50 55 60

TCA AGT GTA TCA TAT GCC AAG TAC GCC CCC TAT TGA CGT CAA 514
Ser Ser Val Ser Tyr Ala Lys Tyr Ala Pro Tyr Arg Gln
65 70 1

TGA CGG TAA ATG GCC CGC CTG GCA TTA TGC CCA GTA CAT GAC CTT ATG 562
Arg Met Ala Arg Leu Ala Leu Cys Pro Val His Asp Leu Met
1 1 5 10

GGA CTT TCC TAC TTG GCA GTA CAT CTA CGT ATT AGT CAT CGC TAT TAC 610
Gly Leu Ser Tyr Leu Ala Val His Leu Arg Ile Ser His Arg Tyr Tyr
15 20 25

CAT GGT GAT GCG GTT TTG GCA GTA CAT CAA TGG GCG TGG ATA GCG GTT 658
His Gly Asp Ala Val Leu Ala Val His Gln Trp Ala Trp Ile Ala Val
30 35 40 45

TGA CTC ACG GGG ATT TCC AAG TCT CCA CCC CAT TGA CGT CAA TGG GAG 706
Leu Thr Gly Ile Ser Lys Ser Pro Pro His Arg Gln Trp Glu
1 5 10 1

TTT GTT TTG GCA CCA AAA TCA ACG GGA CTT TCC AAA ATG TCG 748
Phe Val Leu Ala Pro Lys Ser Thr Gly Leu Ser Lys Met Ser
5 10 15

TAA CAA CTC CGC CCC ATT GAC GCA AAT GGG CGG TAG CGC TGT ACG GTG 796
Gln Leu Arg Pro Ile Asp Ala Asn Gly Arg Arg Cys Thr Val
1 5 10 1

GGA GGT CTA TAT AAG CAG AGC TCT CTG GCT AAC TAG AGA ACC CAC TGC 844
Gly Gly Leu Tyr Lys Gln Ser Ser Leu Ala Asn Arg Thr His Cys
5 10 15 1

TTA CTG GCT TAT CGA AAT TAATACGACT CACTATAGGG AGACCCAAGC 892
Leu Leu Ala Tyr Arg Asn
5 10

TTCGCGCGGG TACCACTCTC TTCCGCATCG CTGTCTGCGA GGGCCAGCTG TTGGGCTCGC 952

GGTTGAGGAC AAACTCTTCG CGGTCTTTCC AGTACTCTTG GATCGGAAAC CCGTCGGCCT 1012

CCGAACGGTA CTCCGCCACC GAGGGACCTG AGCGAGTCCG CATCGACCGG ATCGGAAAAC 1072

CTCTCGAGGC GGCCGCTGCA GTCTAGACGA ATTCGCGTAC GATATCGATG GGCCCTATT 1131

CTA TAG TGT CAC CTA AAT GCTAGAGCTC GCTGATCAGC CTCGACTGTG 1179
Leu Cys His Leu Asn
1 1

CCTTCTAGTT GCCAGCCATC TGTTGTTTGC CCCTCCCCCG TGCGTTCCTT GACCCTGGAA 1239

GGTGCCACTC CCACTGTCGT TTCCTAATAA AATGAGGAAA TTGCATCGCA TTGTCTGAGT 1299

AGGTGTCATT CTATTCTGGG GGGTGGGGTG GGGCAGGACA GCAAGGGGGA GGATTGGGAA 1359

GACAATAGCC GAAATGACCG ACCAAGCGAC GCCCAACCTG CCATCACGAG ATTTCGATTC 1419

CACCGCCGCC TTCTATGAAA GGTTGGGCTT CGGAATCGTT TTCCGGGACG CCGGCTGGAT 1479

GATCCTCCAG CGCGGGGATC TCATGCTGGA GTTCTTCGCC CACCCCAACT TGTTTATTGC 1539

AGCTTATAAT GGTTACAAAT AAAGCAATAG CATCACAAAT TTCACAAATA AAGCATTTTT 1599

TTCACTGCAT TCTAGTTGTG GTTTGTCCAA ACTCATCAAT GTATCTTATC ATGTCTGTAT 1659

ACCGTCGACC TCTAGCTAGA GCTTGGCGTA ATCATGGTCA TAGCTGTTTC CTGTGTGAAA 1719

TTGTTATCCG CTCACAATTC CACACAACAT ACGAGCCGGA AGCATAAAGT GTAAAGCCTG 1779

GGGTGCCTAA TGAGTGAGCT AACTCACATT AATTGCGTTG CGCTCACTGC CCGCTTTCCA 1839

GTCGGGAAAC CTGTCGTGCC AGCTGCATTA ATGAATCGGC CAACGCGCGG GGAGAGGCGG 1899

TTTGCGTATT GGGCGCTCTT CCGCTTCCTC GCTCACTGAC TCGCTGCGCT CGGTCGTTCG 1959

GCTGCGGCGA GCGGTATCAG CTCACTCAAA GGCGGTAATA CGGTTATCCA CAGAATCAGG 2019

GGATAACGCA GGAAAGAACA TGTGAGCAAA AGGCCAGCAA AAGGCCAGGA ACCGTAAAAA 2079

GGCCGCGTTG CTGGCGTTTT TCCATAGGCT CCGCCCCCCT GACGAGCATC ACAAAAATCG 2139

ACGCTCAAGT CAGAGGTGGC GAAACCCGAC AGGACTATAA AGATACCAGG CGTTTCCCCC 2199

TGGAAGCTCC CTCGTGCGCT CTCCTGTTCC GACCCTGCCG CTTACCGGAT ACCTGTCCGC 2259

CTTTCTCCCT TCGGGAAGCG TGGCGCTTTC TCAATGCTCA CGCTGTAGGT ATCTCAGTTC 2319

GGTGTAGGTC GTTCGCTCCA AGCTGGGCTG TGTGCACGAA CCCCCCGTTC AGCCCGACCG 2379

CTGCGCCTTA TCCGGTAACT ATCGTCTTGA GTCCAACCCG GTAAGACACG ACTTATCGCC 2439

ACTGGCAGCA GCCACTGGTA ACAGGATTAG CAGAGCGAGG TATGTAGGCG GTGCTACAGA 2499

GTTCTTGAAG TGGTGGCCTA ACTACGGCTA CACTAGAAGG ACAGTATTTG GTATCTGCGC 2559

TCTGCTGAAG CCAGTTACCT TCGGAAAAAG AGTTGGTAGC TCTTGATCCG GCAAACAAAC 2619

CACCGCTGGT AGCGGTGGTT TTTTTGTTTG CAAGCAGCAG ATTACGCGCA GAAAAAAAGG 2679

ATCTCAAGAA GATCCTTTGA TCTTTTCTAC GGGGTCTGAC GCTCAGTGGA ACGAAAACTC 2739

ACGTTAAGGG ATTTTGGTCA TGAGATTATC AAAAAGGATC TTCACCTAGA TCCTTTTAAA 2799

TTAAAAATGA AGTTTTAAAT CAATCTAAAG TATATATGAG TAAACTTGGT CTGACAGTTA 2859

CCAATGCTTA ATCAGTGAGG CACCTATCTC AGCGATCTGT CTATTTCGTT CATCCATAGT 2919

TGCCTGACTC CCCGTCGTGT AGATAACTAC GATACGGGAG GGCTTACCAT CTGGCCCCAG 2979

TGCTGCAATG ATACCGCGAG ACCCACGCTC ACCGGCTCCA GATTTATCAG CAATAAACCA 3039

GCCAGCCGGA AGGGCCGAGC GCAGAAGTGG TCGTGCAACT TTATCCGCCT CCATCCAGTC 3099

TATTAATTGT TGCCGGGAAG CTAGAGTAAG TAGTTCGCCA GTTAATAGTT TGCGCAACGT 3159

TGTTGCCATT GCTACAGGCA TCGTGGTGTC ACGCTCGTCG TTTGGTATGG CTTCATTCAG 3219

CTCCGGTTCC CAACGATCAA GGCGAGTTAC ATGATCCCCC ATGTTGTGCA AAAAAGCGGT 3279

TAGCTCCTTC GGTCCTCCGA TCGTTGTCAG AAGTAAGTTG GCCGCAGTGT TATCACTCAT 3339

GGTTATGGCA GCACTGCATA ATTCTCTTAC TGTCATGCCA TCCGTAAGAT GCTTTTCTGT 3399

GACTGGTGAG TACTCAACCA AGTCATTCTG AGAATAGTGT ATGCGGCGAC CGAGTTGCTC 3459

TTGCCCGGCG TCAATACGGG ATAATACCGC GCCACATAGC AGAACTTTAA AAGTGCTCAT 3519

CATTGGAAAA CGTTCTTCGG GGCGAAAACT CTCAAGGATC TTACCGCTGT TGAGATCCAG 3579

TTCGATGTAA CCCACTCGTG CACCCAACTG ATCTTCAGCA TCTTTTACTT TCACCAGCGT 3639

TTCTGGGTGA GCAAAAACAG GAAGGCAAAA TGCCGCAAAA AAGGGAATAA GGGCGACACG 3699

GAAATGTTGA ATACTCATAC TCTTCCTTTT TCAATATTAT TGAAGCATTT ATCAGGGTTA 3759

TTGTCTCATG AGCGGATACA TATTTGAATG TATTTAGAAA AATAAACAAA TAGGGGTTCC 3819

GCGCACATTT CCCCGAAAAG TGCCACCTGA CGTC 3853



(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 14 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Arg Cys Thr Gly Gln Ile Tyr Ala Leu Thr Leu Ile Ile Asp
1 5 10

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 12 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

Leu Leu Ile Val Ile Asn Tyr Gly Val Ile Ser Ser
1 5 10

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 71 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Pro Ile Tyr Gly Val Pro Arg Tyr Ile Thr Tyr Gly Lys Trp Pro Ala
1 5 10 15

Trp Leu Thr Ala Gln Arg Pro Pro Pro Ile Asp Val Asn Asn Asp Val
20 25 30

Cys Ser His Ser Asn Ala Asn Arg Asp Phe Pro Leu Thr Ser Met Gly
35 40 45

Gly Leu Phe Thr Val Asn Cys Pro Leu Gly Ser Thr Ser Ser Val Ser
50 55 60

Tyr Ala Lys Tyr Ala Pro Tyr
65 70

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

Arg Gln
1

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Arg
1

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 45 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

Met Ala Arg Leu Ala Leu Cys Pro Val His Asp Leu Met Gly Leu Ser
1 5 10 15

Tyr Leu Ala Val His Leu Arg Ile Ser His Arg Tyr Tyr His Gly Asp
20 25 30

Ala Val Leu Ala Val His Gln Trp Ala Trp Ile Ala Val
35 40 45

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Leu Thr Gly Ile Ser Lys Ser Pro Pro His
1 5 10

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

Arg Gln Trp Glu Phe Val Leu Ala Pro Lys Ser Thr Gly Leu Ser Lys
1 5 10 15

Met Ser

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

Gln Leu Arg Pro Ile Asp Ala Asn Gly Arg
1 5 10

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 15 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

Arg Cys Thr Val Gly Gly Leu Tyr Lys Gln Ser Ser Leu Ala Asn
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

Arg Thr His Cys Leu Leu Ala Tyr Arg Asn
1 5 10

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

Leu
1

(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

Cys His Leu Asn
1

(2) INFORMATION FOR SEQ ID NO: 19: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4026 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 209..250

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 254..289

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 293..505

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 509..514

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 518..520

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 524..658

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 662..691

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 695..748

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 752..781

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 785..829

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 833..862

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1305..1307

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1311..1322

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
GACGGATCGG GAGATCTCCC GATCCCCTAT GGTCGACTCT CAGTACAATC TGCTCTGATG 60

CCGCATAGTT AAGCCAGTAT CTGCTCCCTG CTTGTGTGTT GGAGGTCGCT GAGTAGTGCG 120

CGAGCAAAAT TTAAGCTACA ACAAGGCAAG GCTTGACCGA CAATTGCATG AAGAATCTGC 180

TTAGGGTTAG GCGTTTTGCG CTGCTTCG CGA TGT ACG GGC CAG ATA TAC GCG 232
Arg Cys Thr Gly Gln Ile Tyr Ala
1 5

TTG ACA TTG ATT ATT GAC TAG TTA TTA ATA GTA ATC AAT TAC GGG GTC 280
Leu Thr Leu Ile Ile Asp Leu Leu Ile Val Ile Asn Tyr Gly Val
10 1 5

ATT AGT TCA TAG CCC ATA TAT GGA GTT CCG CGT TAC ATA ACT TAC GGT 328
Ile Ser Ser Pro Ile Tyr Gly Val Pro Arg Tyr Ile Thr Tyr Gly
10 1 5 10

AAA TGG CCC GCC TGG CTG ACC GCC CAA CGA CCC CCG CCC ATT GAC GTC 376
Lys Trp Pro Ala Trp Leu Thr Ala Gln Arg Pro Pro Pro Ile Asp Val
15 20 25

AAT AAT GAC GTA TGT TCC CAT AGT AAC GCC AAT AGG GAC TTT CCA TTG 424
Asn Asn Asp Val Cys Ser His Ser Asn Ala Asn Arg Asp Phe Pro Leu
30 35 40

ACG TCA ATG GGT GGA CTA TTT ACG GTA AAC TGC CCA CTT GGC AGT ACA 472
Thr Ser Met Gly Gly Leu Phe Thr Val Asn Cys Pro Leu Gly Ser Thr
45 50 55 60

TCA AGT GTA TCA TAT GCC AAG TAC GCC CCC TAT TGA CGT CAA 514
Ser Ser Val Ser Tyr Ala Lys Tyr Ala Pro Tyr Arg Gln
65 70 1

TGA CGG TAA ATG GCC CGC CTG GCA TTA TGC CCA GTA CAT GAC CTT ATG 562
Arg Met Ala Arg Leu Ala Leu Cys Pro Val His Asp Leu Met
1 1 5 10

GGA CTT TCC TAC TTG GCA GTA CAT CTA CGT ATT AGT CAT CGC TAT TAC 610
Gly Leu Ser Tyr Leu Ala Val His Leu Arg Ile Ser His Arg Tyr Tyr
15 20 25

CAT GGT GAT GCG GTT TTG GCA GTA CAT CAA TGG GCG TGG ATA GCG GTT 658
His Gly Asp Ala Val Leu Ala Val His Gln Trp Ala Trp Ile Ala Val
30 35 40 45

TGA CTC ACG GGG ATT TCC AAG TCT CCA CCC CAT TGA CGT CAA TGG GAG 706
Leu Thr Gly Ile Ser Lys Ser Pro Pro His Arg Gln Trp Glu
1 5 10 1

TTT GTT TTG GCA CCA AAA TCA ACG GGA CTT TCC AAA ATG TCG 748
Phe Val Leu Ala Pro Lys Ser Thr Gly Leu Ser Lys Met Ser
5 10 15

TAA CAA CTC CGC CCC ATT GAC GCA AAT GGG CGG TAG GCG TGT ACG GTG 796
Gln Leu Arg Pro Ile Asp Ala Asn Gly Arg Ala Cys Thr Val
1 5 10 1

GGA GGT CTA TAT AAG CAG AGC TCT CTG GCT AAC TAG AGA ACC CAC TGC 844
Gly Gly Leu Tyr Lys Gln Ser Ser Leu Ala Asn Arg Thr His Cys
5 10 15 1

TTA CTG GCT TAT CGA AAT TAATACGACT CACTATAGGG AGACCCAAGC 892
Leu Leu Ala Tyr Arg Asn
5 10

TTCGCGCGGG TACCACTCTC TTCCGCATCG CTGTCTGCGA GGGCCAGCTG TTGGGCTCGC 952

GGTTGAGGAC AAACTCTTCG CGGTCTTTCC AGTACTCTTG GATCGGAAAC CCGTCGGCCT 1012

CCGAACGGTA CTCCGCCACC GAGGGACCTG AGCGAGTCCG CATCGACCGG ATCGGAAAAC 1072

CTCTCGAGGA ACTGAAAAAC CAGAAAGTTA ACTGGTAAGT TTAGTCTTTT TGTCTTTTTA 1132

TTTCAGGTCC CGGATCCGGT GGTGGTGCAA ATCAAAGAAC TGCTCCTCAG TGGATGTTGC 1192

CTTTACTTCT AGGCCTGTAC GGAAGTGTTA CTTCTGCTCT AAAAGCTGCG GAATTGTACC 1252

CGCGGCCGCT GCAGTCTAGA CGAATTCGCG TACGATATCG ATGGGCCCTA TT CTA 1307
Leu
1

TAG TGT CAC CTA AAT GCTAGAGCTC GCTGATCAGC CTCGACTGTG CCTTCTAGTT 1362
Cys His Leu Asn
1

GCCAGCCATC TGTTGTTTGC CCCTCCCCCG TGCCTTCCTT GACCCTGGAA GGTGCCACTC 1422

CCACTGTCCT TTCCTAATAA AATGAGGAAA TTGCATCGCA TTGTCTGAGT AGGTGTCATT 1482

CTATTCTGGG GGGTGGGGTG GGGCAGGACA GCAAGGGGGA GGATTGGGAA GACAATAGCC 1542

GAAATGACCG ACCAAGCGAC GCCCAACCTG CCATCACGAG ATTTCGATTC CACCGCCGCC 1602

TTCTATGAAA GGTTGGGCTT CGGAATCGTT TTCCGGGACG CCGGCTGGAT GATCCTCCAG 1662

CGCGGGGATC TCATGCTGGA GTTCTTCGCC CACCCCAACT TGTTTATTGC AGCTTATAAT 1722

GGTTACAAAT AAAGCAATAG CATCACAAAT TTCACAAATA AAGCATTTTT TTCACTGCAT 1782

TCTAGTTGTG GTTTGTCCAA ACTCATCAAT GTATCTTATC ATGTCTGTAT ACCGTCGACC 1842

TCTAGCTAGA GCTTGGCGTA ATCATGGTCA TAGCTGTTTC CTGTGTGAAA TTGTTATCCG 1902

CTCACAATTC CACACAACAT ACGAGCCGGA AGCATAAAGT GTAAAGCCTG GGGTGCCTAA 1962

TGAGTGAGCT AACTCACATT AATTGCGTTG CGCTCACTGC CCGCTTTCCA GTCGGGAAAC 2022

CTGTCGTGCC AGCTGCATTA ATGAATCGGC CAACGCGCGG GGAGAGGCGG TTTGCGTATT 2082

GGGCGCTCTT CCGCTTCCTC GCTCACTGAC TCGCTGCGCT CGGTCGTTCG GCTGCGGCGA 2142

GCGGTATCAG CTCACTCAAA GGCGGTAATA CGGTTATCCA CAGAATCAGG GGATAACGCA 2202

GGAAAGAACA TGTGAGCAAA AGGCCAGCAA AAGGCCAGGA ACCGTAAAAA GGCCGCGTTG 2262

CTGGCGTTTT TCCATAGGCT CCGCCCCCCT GACGAGCATC ACAAAAATCG ACGCTCAAGT 2322

CAGAGGTGGC GAAACCCGAC AGGACTATAA AGATACCAGG CGTTTCCCCC TGGAAGCTCC 2382

CTCGTGCGCT CTCCTGTTCC GACCCTGCCG CTTACCGGAT ACCTGTCCGC CTTTCTCCCT 2442

TCGGGAAGCG TGGCGCTTTC TCAATGCTCA CGCTGTAGGT ATCTGAGTTC GGTGTAGGTC 2502

GTTCGCTCCA AGCTGGGCTG TGTGCACGAA CCCCCCGTTC AGCCCGACCG CTGCGCCTTA 2562

TCCGGTAACT ATCGTCTTGA GTCCAACCCG GTAAGACACG ACTTATCGCC ACTGGCAGCA 2622

GCCACTGGTA ACAGGATTAG CAGAGCGAGG TATGTAGGCG GTGCTACAGA GTTCTTGAAG 2682

TGGTGGCCTA ACTACGGCTA CACTAGAAGG ACAGTATTTG GTATCTGCGC TCTGCTGAAG 2742

CCAGTTACCT TCGGAAAAAG AGTTGGTAGC TCTTGATCCG GCAAACAAAC CACCGCTGGT 2802

AGCGGTGGTT TTTTTGTTTG CAAGCAGCAG ATTACGCGCA GAAAAAAAGG ATCTCAAGAA 2862

GATCCTTTGA TCTTTTCTAC GGGGTCTGAC GCTCAGTGGA ACGAAAACTC ACGTTAAGGG 2922

ATTTTGGTCA TGAGATTATC AAAAAGGATC TTCACCTAGA TCCTTTTAAA TTAAAAATGA 2982

AGTTTTAAAT CAATCTAAAG TATATATGAG TAAACTTGGT CTGACAGTTA CCAATGCTTA 3042

ATCAGTGAGG CACCTATCTC AGCGATCTGT CTATTTCGTT CATCCATAGT TGCCTGACTC 3102

CCCGTCGTGT AGATAACTAC GATACGGGAG GGCTTACCAT CTGGCCCCAG TGCTGCAATG 3162

ATACCGCGAG ACCCACGCTC ACCGGCTCCA GATTTATCAG CAATAAACCA GCCAGCCGGA 3222

AGGGCCGAGC GCAGAAGTGG TCCTGCAACT TTATCCGCCT CCATCCAGTC TATTAATTGT 3282

TGCCGGGAAG CTAGAGTAAG TAGTTCGCCA GTTAATAGTT TGCGCAACGT TGTTGCCATT 3342

GCTACAGGCA TCGTGGTGTC ACGCTCGTCG TTTGGTATGG CTTCATTCAG CTCCGGTTCC 3402

CAACGATCAA GGCGAGTTAC ATGATCCCCC ATGTTGTGCA AAAAAGCGGT TAGCTCCTTC 3462


GGTCCTCCGA 


TCGTTGTCAG 


AAGTAAGTTG 


GCCGCAGTGT 


TATCACTCAT 


GGTTATGGCA 


3522 


GCACTGCATA 


ATTCTCTTAC 


TGTCATGCCA 


TCCGTAAGAT 


GCTTTTCTGT 


GACTGGTGAG 


3582 


TACTCAACCA 


AGTCATTCTG 


AGAATAGTGT 


ATGCGGCGAC 


CGAGTTGCTC 


TTGCCCGGCG 


3642 


TCAATACGGG 


ATAATAC CGC 


GCCACATAGC 


AGAACTTTAA 


AAGTGCTCAT 


CATTGGAAAA 


3702 


CGTTCTTCGG 


GGCGAAAACT 


CTCAAGGATC 


TTACCGCTGT 


TGAGATC C AG 


TTCGATGTAA 


3762 


CCCACTCGTG 


CACCCAACTG 


ATCTTCAGCA 


TCTTTTACTT 


TCAC CAGCGT 


TTCTGGGTGA 


3822 


GCAAAAACAG 


GAAGGCAAAA 


TGCCGCAAAA 


AAGGGAATAA 


GGGCGACACG 


GAAATGTTGA 


3882 


ATACTCATAC 


TCTTCCTTTT 


TCAATATTAT 


TGAAGCATTT 


ATCAGGGTTA 


TTGTCTCATG 


3942 


AGCGGATACA 


TATTTGAATG 


TATTTAGAAA 


AATAAACAAA 


TAGGGGTTCC 


GCGCACATTT 


4002 


CCCCGAAAAG 


TGCCACCTGA 


CGTC 








4026 



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

Arg Cys Thr Gly Gln Ile Tyr Ala Leu Thr Leu Ile Ile Asp
1 5 10



(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

Leu Leu Ile Val Ile Asn Tyr Gly Val Ile Ser Ser
1 5 10



(2) INFORMATION FOR SEQ ID NO:22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 71 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22: 

Pro Ile Tyr Gly Val Pro Arg Tyr Ile Thr Tyr Gly Lys Trp Pro Ala
1 5 10 15

Trp Leu Thr Ala Gln Arg Pro Pro Pro Ile Asp Val Asn Asn Asp Val
20 25 30

Cys Ser His Ser Asn Ala Asn Arg Asp Phe Pro Leu Thr Ser Met Gly 
35 40 45 

Gly Leu Phe Thr Val Asn Cys Pro Leu Gly Ser Thr Ser Ser Val Ser 
50 55 60 

Tyr Ala Lys Tyr Ala Pro Tyr 
65 70 



(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

Arg Gln
1

(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24: 

Arg 
1 



(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 45 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

Met Ala Arg Leu Ala Leu Cys Pro Val His Asp Leu Met Gly Leu Ser
1 5 10 15

Tyr Leu Ala Val His Leu Arg Ile Ser His Arg Tyr Tyr His Gly Asp
20 25 30

Ala Val Leu Ala Val His Gln Trp Ala Trp Ile Ala Val
35 40 45 



(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26: 

Leu Thr Gly Ile Ser Lys Ser Pro Pro His
1 5 10



(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

Arg Gln Trp Glu Phe Val Leu Ala Pro Lys Ser Thr Gly Leu Ser Lys
1 5 10 15



Met Ser 



(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 

Gln Leu Arg Pro Ile Asp Ala Asn Gly Arg
1 5 10

(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29: 

Ala Cys Thr Val Gly Gly Leu Tyr Lys Gln Ser Ser Leu Ala Asn
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

Arg Thr His Cys Leu Leu Ala Tyr Arg Asn
1 5 10

(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 






Leu 
1 



(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32 

Cys His Leu Asn 
1 

(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4249 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 209..250

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 254..289

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 293..505

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 509..514

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 518..520

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 524..658

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 662..691

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 695..748

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 752..781

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 785..829

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 833..862

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1528..1530

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1534..1545

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 

GACGGATCGG GAGATCTCCC GATCCCCTAT GGTCGACTCT CAGTACAATC TGCTCTGATG 60 

CCGCATAGTT AAGCCAGTAT CTGCTCCCTG CTTGTGTGTT GGAGGTCGCT GAGTAGTGCG 120

CGAGCAAAAT TTAAGCTACA ACAAGGCAAG GCTTGACCGA CAATTGCATG AAGAATCTGC 180

TTAGGGTTAG GCGTTTTGCG CTGCTTCG CGA TGT ACG GGC CAG ATA TAC GCG 232

Arg Cys Thr Gly Gln Ile Tyr Ala
1 5

TTG ACA TTG ATT ATT GAC TAG TTA TTA ATA GTA ATC AAT TAC GGG GTC 280

Leu Thr Leu Ile Ile Asp Leu Leu Ile Val Ile Asn Tyr Gly Val

10 15

ATT AGT TCA TAG CCC ATA TAT GGA GTT CCG CGT TAC ATA ACT TAC GGT 328

Ile Ser Ser Pro Ile Tyr Gly Val Pro Arg Tyr Ile Thr Tyr Gly

10 1 5 10

AAA TGG CCC GCC TGG CTG ACC GCC CAA CGA CCC CCG CCC ATT GAC GTC 376

Lys Trp Pro Ala Trp Leu Thr Ala Gln Arg Pro Pro Pro Ile Asp Val
15 20 25

AAT AAT GAC GTA TGT TCC CAT AGT AAC GCC AAT AGG GAC TTT CCA TTG 424

Asn Asn Asp Val Cys Ser His Ser Asn Ala Asn Arg Asp Phe Pro Leu
30 35 40

ACG TCA ATG GGT GGA CTA TTT ACG GTA AAC TGC CCA CTT GGC AGT ACA 472

Thr Ser Met Gly Gly Leu Phe Thr Val Asn Cys Pro Leu Gly Ser Thr
45 50 55 60

TCA AGT GTA TCA TAT GCC AAG TAC GCC CCC TAT TGA CGT CAA 514
Ser Ser Val Ser Tyr Ala Lys Tyr Ala Pro Tyr Arg Gln

65 70 1

TGA CGG TAA ATG GCC CGC CTG GCA TTA TGC CCA GTA CAT GAC CTT ATG 562

Arg Met Ala Arg Leu Ala Leu Cys Pro Val His Asp Leu Met

1 1 5 10

GGA CTT TCC TAC TTG GCA GTA CAT CTA CGT ATT AGT CAT CGC TAT TAC 610
Gly Leu Ser Tyr Leu Ala Val His Leu Arg Ile Ser His Arg Tyr Tyr
15 20 25

CAT GGT GAT GCG GTT TTG GCA GTA CAT CAA TGG GCG TGG ATA GCG GTT 658
His Gly Asp Ala Val Leu Ala Val His Gln Trp Ala Trp Ile Ala Val
30 35 40 45

TGA CTC ACG GGG ATT TCC AAG TCT CCA CCC CAT TGA CGT CAA TGG GAG 706
Leu Thr Gly Ile Ser Lys Ser Pro Pro His Arg Gln Trp Glu
1 5 10 1

TTT GTT TTG GCA CCA AAA TCA ACG GGA CTT TCC AAA ATG TCG 748
Phe Val Leu Ala Pro Lys Ser Thr Gly Leu Ser Lys Met Ser
5 10 15

TAA CAA CTC CGC CCC ATT GAC GCA AAT GGG CGG TAG GCG TGT ACG GTG 796
Gln Leu Arg Pro Ile Asp Ala Asn Gly Arg Ala Cys Thr Val
1 5 10 1

GGA GGT CTA TAT AAG CAG AGC TCT CTG GCT AAC TAG AGA ACC CAC TGC 844
Gly Gly Leu Tyr Lys Gln Ser Ser Leu Ala Asn Arg Thr His Cys
5 10 15 1

TTA CTG GCT TAT CGA AAT TAATACGACT CACTATAGGG AGACCCAAGC 892
Leu Leu Ala Tyr Arg Asn
5 10

TTCGCGCGGG TACCACTCTC TTCCGCATCG CTGTCTGCGA GGGCCAGCTG TTGGGCTCGC 952 

GGTTGAGGAC AAACTCTTCG CGGTCTTTCC AGTACTCTTG GATCGGAAAC CCGTCGGCCT 1012 

CCGAACGGTA CTCCGCCACC GAGGGACCTG AGCGAGTCCG CATCGACCGG ATCGGAAAAC 1072

CTCTCGAGGA ACTGAAAAAC CAGAAAGTTA ACTGGTAAGT TTAGTCTTTT TGTCTTTTTA 1132

TTTCAGGTCC CGGATCTGAG TTAGGGCGGG ACATGGGCGG AGTTAGGGGC GGGACTATGG 1192

TTGCTGACTA ATTGAGATGC ATGCTTTGCA TACTTCTGCC TGCTGGGGAG CCTGGGGACT 1252

TTCCACACCT GGTTGCTGAC TAATTGAGAT GCATGCTTTG CATACTTCTG CCTGCTGGGG 1312

AGCCTGGGGA CTTTCCACAC CCTAACTGAC ACACATTCCA CAGCTGGTTC TTTCAGATCC 1372

GGTGGTGGTG CAAATCAAAG AACTGCTCCT CAGTGGATGT TGCCTTTACT TCTAGGCCTG 1432

TACGGAAGTG TTACTTCTGC TCTAAAAGCT GCGGAATTGT ACCCGCGGCC GCTGCAGTCT 1492

AGACGAATTC GCGTACGATA TCGATGGGCC CTATT CTA TAG TGT CAC CTA AAT 1545

Leu Cys His Leu Asn

1 1

GCTAGAGCTC GCTGATCAGC CTCGACTGTG CCTTCTAGTT GCCAGCCATC TGTTGTTTGC 1605

CCCTCCCCCG TGCCTTCCTT GACCCTGGAA GGTGCCACTC CCACTGTCCT TTCCTAATAA 1665

AATGAGGAAA TTGCATCGCA TTGTCTGAGT AGGTGTCATT CTATTCTGGG GGGTGGGGTG 1725

GGGCAGGACA GCAAGGGGGA GGATTGGGAA GACAATAGCC GAAATGACCG ACCAAGCGAC 1785

GCCCAACCTG CCATCACGAG ATTTCGATTC CACCGCCGCC TTCTATGAAA GGTTGGGCTT 1845

CGGAATCGTT TTCCGGGACG CCGGCTGGAT GATCCTCCAG CGCGGGGATC TCATGCTGGA 1905

GTTCTTCGCC CACCCCAACT TGTTTATTGC AGCTTATAAT GGTTACAAAT AAAGCAATAG 1965

CATCACAAAT TTCACAAATA AAGCATTTTT TTCACTGCAT TCTAGTTGTG GTTTGTCCAA 2025

AGTCATCAAT GTATCTTATC ATGTCTGTAT ACCGTCGACC TCTAGCTAGA GCTTGGCGTA 2085

ATCATGGTCA TAGCTGTTTC CTGTGTGAAA TTGTTATCCG CTCACAATTC CACACAACAT 2145

ACGAGCCGGA AGCATAAAGT GTAAAGCCTG GGGTGCCTAA TGAGTGAGCT AACTCACATT 2205

AATTGCGTTG CGCTCACTGC CCGCTTTCCA GTCGGGAAAC CTGTCGTGCC AGCTGCATTA 2265

ATGAATCGGC CAACGCGCGG GGAGAGGCGG TTTGCGTATT GGGCGCTCTT CCGCTTCCTC 2325

GGTCACTGAC TCGCTGCGCT CGGTCGTTCG GCTGCGGCGA GCGGTATCAG CTCACTCAAA 2385

GGCGGTAATA CGGTTATCCA CAGAATCAGG GGATAACGCA GGAAAGAACA TGTGAGCAAA 2445

AGGCCAGCAA AAGGCCAGGA ACCGTAAAAA GGCCGCGTTG CTGGCGTTTT TCCATAGGCT 2505

CCGCCCCCCT GACGAGCATC ACAAAAATCG ACGCTCAAGT CAGAGGTGGC GAAACCCGAC 2565

AGGACTATAA AGATACCAGG CGTTTCCCCC TGGAAGCTCC CTCGTGCGCT CTCCTGTTCC 2625

GACCCTGCCG CTTACCGGAT ACCTGTCCGC CTTTCTCCCT TCGGGAAGCG TGGCGCTTTC 2685

TCAATGCTCA CGCTGTAGGT ATCTCAGTTC GGTGTAGGTC GTTCGCTCCA AGCTGGGCTG 2745

TGTGCACGAA CCCCCCGTTC AGCCCGACCG CTGCGCCTTA TCCGGTAACT ATCGTCTTGA 2805

GTCCAACCCG GTAAGACACG ACTTATCGCC ACTGGCAGCA GCCACTGGTA ACAGGATTAG 2865

CAGAGCGAGG TATGTAGGCG GTGCTACAGA GTTCTTGAAG TGGTGGCCTA ACTACGGCTA 2925

CACTAGAAGG ACAGTATTTG GTATCTGCGC TCTGCTGAAG CCAGTTACCT TCGGAAAAAG 2985

AGTTGGTAGC TCTTGATCCG GCAAACAAAC CACCGCTGGT AGCGGTGGTT TTTTTGTTTG 3045

CAAGCAGCAG ATTACGCGCA GAAAAAAAGG ATCTCAAGAA GATCCTTTGA TCTTTTCTAC 3105

GGGGTCTGAC GCTCAGTGGA ACGAAAACTC ACGTTAAGGG ATTTTGGTCA TGAGATTATC 3165

AAAAAGGATC TTCACCTAGA TGCTTTTAAA TTAAAAATGA AGTTTTAAAT CAATCTAAAG 3225

TATATATGAG TAAACTTGGT CTGACAGTTA CCAATGCTTA ATCAGTGAGG CACCTATCTC 3285

AGCGATCTGT CTATTTCGTT CATCCATAGT TGCCTGACTC CCCGTCGTGT AGATAACTAC 3345

GATACGGGAG GGCTTACCAT CTGGCCCCAG TGCTGCAATG ATACCGCGAG ACCCACGCTC 3405

ACCGGCTCCA GATTTATCAG CAATAAACCA GCCAGCCGGA AGGGCCGAGC GCAGAAGTGG 3465

TCCTGCAACT TTATCCGCCT CCATCCAGTC TATTAATTGT TGCCGGGAAG CTAGAGTAAG 3525

TAGTTCGCCA GTTAATAGTT TGCGCAACGT TGTTGCCATT GCTACAGGCA TCGTGGTGTC 3585

ACGCTCGTCG TTTGGTATGG CTTCATTCAG CTCCGGTTCC CAACGATCAA GGCGAGTTAC 3645

ATGATCCCCC ATGTTGTGCA AAAAAGCGGT TAGCTCCTTC GGTCCTCCGA TCGTTGTCAG 3705

AAGTAAGTTG GCCGCAGTGT TATCACTCAT GGTTATGGCA GCACTGCATA ATTCTCTTAC 3765



TGTCATGCCA TCCGTAAGAT GCTTTTCTGT GACTGGTGAG TACTCAACCA AGTCATTCTG 3825

AGAATAGTGT ATGCGGCGAC CGAGTTGCTC TTGCCCGGCG TCAATACGGG ATAATACCGC 3885

GCCACATAGC AGAACTTTAA AAGTGCTCAT CATTGGAAAA CGTTCTTCGG GGCGAAAACT 3945

CTCAAGGATC TTACCGCTGT TGAGATCCAG TTCGATGTAA CCCACTCGTG CACCCAACTG 4005

ATCTTCAGCA TCTTTTACTT TCACCAGCGT TTCTGGGTGA GCAAAAACAG GAAGGCAAAA 4065

TGCCGCAAAA AAGGGAATAA GGGCGACACG GAAATGTTGA ATACTCATAC TCTTCCTTTT 4125

TCAATATTAT TGAAGCATTT ATCAGGGTTA TTGTCTCATG AGCGGATACA TATTTGAATG 4185

TATTTAGAAA AATAAACAAA TAGGGGTTCC GCGCACATTT CCCCGAAAAG TGCCACCTGA 4245

CGTC

(2) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

Arg Cys Thr Gly Gln Ile Tyr Ala Leu Thr Leu Ile Ile Asp
1 5 10

(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:35: 

Leu Leu Ile Val Ile Asn Tyr Gly Val Ile Ser Ser
1 5 10

(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 71 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:

Pro Ile Tyr Gly Val Pro Arg Tyr Ile Thr Tyr Gly Lys Trp Pro Ala
1 5 10 15

Trp Leu Thr Ala Gln Arg Pro Pro Pro Ile Asp Val Asn Asn Asp Val
20 25 30

Cys Ser His Ser Asn Ala Asn Arg Asp Phe Pro Leu Thr Ser Met Gly 
35 40 45 

Gly Leu Phe Thr Val Asn Cys Pro Leu Gly Ser Thr Ser Ser Val Ser 
50 55 60 

Tyr Ala Lys Tyr Ala Pro Tyr 
65 70 



(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:

Arg Gln
1 

(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38 

Arg 
1 



(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 45 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 

Met Ala Arg Leu Ala Leu Cys Pro Val His Asp Leu Met Gly Leu Ser
1 5 10 15

Tyr Leu Ala Val His Leu Arg Ile Ser His Arg Tyr Tyr His Gly Asp
20 25 30

Ala Val Leu Ala Val His Gln Trp Ala Trp Ile Ala Val
35 40 45 






(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:40: 

Leu Thr Gly Ile Ser Lys Ser Pro Pro His
1 5 10



(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:41: 

Arg Gln Trp Glu Phe Val Leu Ala Pro Lys Ser Thr Gly Leu Ser Lys
1 5 10 15

Met Ser 



(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:42: 

Gln Leu Arg Pro Ile Asp Ala Asn Gly Arg
1 5 10



(2) INFORMATION FOR SEQ ID NO:43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:43: 

Ala Cys Thr Val Gly Gly Leu Tyr Lys Gln Ser Ser Leu Ala Asn
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 44:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:

Arg Thr His Cys Leu Leu Ala Tyr Arg Asn
1 5 10



(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45 

Leu 
1 



(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46 



Cys His Leu Asn 
1 



